tracker issue : CF-3556864

Title:

Jakarta isapi_redirector Intermittent Service Temporary Unavailable Errors


Status/Resolution/Reason: Closed/Fixed/

Reporter/Name(from Bugbase): Jake Hand / Jake Hand (Jake Hand)

Created: 05/08/2013

Components: Installation/Config, Connector

Versions: 10.0

Failure Type:

Found In Build/Fixed In Build: Final /

Priority/Frequency: Major / Most users will encounter

Locale/System: English / Win 2008 Server R2 64 bit

Vote Count: 2

Problem Description: In a multi-tenant Windows Server 2008/IIS 7.5 environment (with separate application pools for each site), ColdFusion 10 intermittently displays the following error:
    Service Temporary Unavailable!

    The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.

    Jakarta/ISAPI/isapi_redirector/1.2.32 ()

Steps to Reproduce: This intermittent issue affects some sites on a server, and when it occurs you only have to try browsing to the affected site to experience the issue. It will take forever for the site to load, and it will eventually just show the error above.

Actual Result: Instead of the site loading properly, the Jakarta error above is displayed.

Expected Result: The site's page should load without error.

Any Workarounds: 
     When the error occurs the only solution to get all sites working again is to reset CF and IIS.

     Because the CF team mentioned that the IIS-CF connector needs to be tuned based on the server's load, we have spent a lot of time tuning the connector settings for our environment. First we tried increasing the "max_reuse_connections" attribute for the connector in its workers.properties file. This would delay the occurrence of the error, but it only provided temporary relief. 
     Then we tried enabling the "connection_pool_timeout" setting as recommended by the Adobe ColdFusion team's blog. This too would delay the error, but no matter what timeout value was used the error would eventually come back, as if threads were not being timed out by the connector. We've tried short timeouts (as low as 30 seconds) and long ones (up to the 10 minutes recommended by Tomcat's Timeouts How-To guide) with no real success. Eventually the error always appears. When the error occurs on a site, our JVM heap is in great shape and our performance monitor (SeeFusion) shows no hung requests queuing up.
     Lastly, we've also tried setting the "connection_pool_size" and the "connection_pool_size" already provided by Tomcat's web server connector. These too seem to provide temporary relief, but no combination of any of the above settings seems to keep CF stable long-term. Additionally, because there is almost no documentation on the "max_reuse_connections" attribute (which seems specific to CF 10), it is not evident how that setting relates to the "connection_pool_size" setting.
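
For reference, a minimal sketch of how the connector attributes discussed above could be read back out of a workers.properties file when collecting settings for a report like this one. The worker name "cfusion", the sample values, and the helper names are assumptions made for illustration; they are not taken from the reporter's configuration and are not tuning recommendations. Port 8012 matches the AJP port quoted later in this thread.

```python
# Hypothetical workers.properties content. The worker name "cfusion" and all
# values are placeholders, not recommendations.
SAMPLE = """\
worker.list=cfusion
worker.cfusion.type=ajp13
worker.cfusion.host=localhost
worker.cfusion.port=8012
worker.cfusion.max_reuse_connections=250
worker.cfusion.connection_pool_size=500
worker.cfusion.connection_pool_timeout=60
"""

# The attributes named in the workaround description above.
TUNING_KEYS = ("max_reuse_connections", "connection_pool_size", "connection_pool_timeout")


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def tuning_settings(props):
    """Return {worker_name: {attribute: value}} for the attributes in TUNING_KEYS."""
    result = {}
    for key, value in props.items():
        parts = key.split(".")
        if len(parts) == 3 and parts[0] == "worker" and parts[2] in TUNING_KEYS:
            result.setdefault(parts[1], {})[parts[2]] = value
    return result


if __name__ == "__main__":
    for worker, settings in tuning_settings(parse_properties(SAMPLE)).items():
        for name in TUNING_KEYS:
            print(f"{worker}.{name} = {settings.get(name, '(not set)')}")
```

To use it against a real installation, read the connector's workers.properties file into a string and pass it to `parse_properties` in place of `SAMPLE`.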

----------------------------- Additional Watson Details -----------------------------

Watson Bug ID:	3556864

External Customer Info:
External Company:  
External Customer Name: jakefusion
External Customer Email:  
External Test Config: My Hardware and Environment details:

- Windows Server 2008 64-bit

- IIS 7.5

- ColdFusion Enterprise 64-bit "Standalone" installation

Attachments:

Comments:

QUESTION: Is this error logged? Is the only way to detect this to monitor every hosted website to identify whether it returns a successful "200 OK"?
Comment by External U.
15454 | May 08, 2013 08:09:20 PM GMT
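
The question above is not answered in this thread. As a stopgap, the external monitoring it describes could be sketched roughly as follows: request each hosted site and flag any response whose body contains the error text quoted in the problem description. The site URLs, function names, and the 30-second timeout are assumptions for illustration only, and this says nothing about whether the connector itself logs the error.

```python
import urllib.error
import urllib.request

# Marker text taken from the error page quoted in this report.
ERROR_MARKER = "Service Temporary Unavailable"


def classify(status, body):
    """Return 'ok', 'connector_error', or 'other_error' for one response."""
    if ERROR_MARKER in body:
        return "connector_error"
    if status == 200:
        return "ok"
    return "other_error"


def check_site(url, timeout=30):
    """Fetch url and classify the result.

    A timeout or a failed connection is reported as 'other_error'.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Non-2xx responses raise HTTPError but still carry a status and body.
        status = err.code
        body = err.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        return "other_error"
    return classify(status, body)


if __name__ == "__main__":
    # Hypothetical site list; replace with the hosted sites to watch.
    for site in ["http://site-one.example/", "http://site-two.example/"]:
        print(site, check_site(site))
```

Checking the body for the marker text, rather than relying on the status code alone, keeps the check tied to the exact page shown in this report.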
Can you please let us know the values you have set for max_reuse_connections and connection_pool_size, and the value of maxThreads in the Connector block in server.xml (in cfusion\runtime\conf\), which looks like "<Connector port="8012" protocol="AJP/1.3" redirectPort="8445" tomcatAuthentication="false" maxThreads="300" />"? Please also send your JVM config settings and the load details on the server when this issue occurs. Please email the details to asha@adobe.com, and let me know your email ID too. Thanks!
Comment by Asha K.
15455 | May 09, 2013 12:04:18 AM GMT
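
One way to collect the maxThreads value requested above is to read it from server.xml programmatically. The sketch below uses the Connector element quoted in the comment; the surrounding Server and Service elements and the function name are assumptions added so that the sample parses as a complete document.

```python
import xml.etree.ElementTree as ET

# The Connector element is the one quoted in the comment above; the Server and
# Service wrappers are hypothetical, added so the sample is well-formed XML.
SAMPLE_SERVER_XML = """\
<Server>
  <Service name="Catalina">
    <Connector port="8012" protocol="AJP/1.3" redirectPort="8445"
               tomcatAuthentication="false" maxThreads="300" />
  </Service>
</Server>
"""


def ajp_connectors(xml_text):
    """Return the port and maxThreads of every AJP Connector element.

    maxThreads is None when the attribute is not set explicitly.
    """
    root = ET.fromstring(xml_text)
    found = []
    for conn in root.iter("Connector"):
        if conn.get("protocol", "").startswith("AJP"):
            found.append({"port": conn.get("port"), "maxThreads": conn.get("maxThreads")})
    return found


if __name__ == "__main__":
    for conn in ajp_connectors(SAMPLE_SERVER_XML):
        print(f"AJP port {conn['port']}: maxThreads={conn['maxThreads']}")
```

For a real server, pass the contents of the server.xml file under cfusion\runtime\conf\ instead of `SAMPLE_SERVER_XML`.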
For more information about this error we need to look into the connector logs at ColdFusion10\config\wsconfig\<folder for relevant connector>.
Comment by Asha K.
15456 | May 14, 2013 02:57:57 AM GMT
We are experiencing the same issue. Based on the isapi_redirect.log we have tracked it down to the connector (1.2.32). Adobe needs to update the Tomcat connector. I checked the change log and the bug was fixed in 1.2.35:
http://tomcat.apache.org/connectors-doc/miscellaneous/changelog.html
    fix HTTPD: Fix crash on unknown worker names. (mturk)
    fix IIS: Fix crash on worker process recycle. (mturk)
    fix 52659: IIS: Fix shared memory corruption. (mturk)
    fix 52921: HTTPD: Fix crash in uri mapping. (mturk)
Vote by External U.
15461 | October 16, 2013 08:52:14 AM GMT
We are seeing this on one of our production servers. The problem just started happening recently and the server has been running CF10 for nearly a year. No code has been modified in months and the only way to fix it is to restart IIS. In the past 3 days this occurred more than 8 times. Is there any fix planned?
Vote by External U.
15462 | July 14, 2014 10:23:07 AM GMT
Trying to delve into the issue based on data provided by the customer. If this issue reoccurs, please contact me at inoel@adobe.com
Comment by Immanuel N.
15457 | September 08, 2014 11:38:35 PM GMT
@Jake, a series of application pool failures were fixed in HF 14. We believe the issue you are referring to should no longer occur, since it comes down to a hung application pool connection. Can you please confirm whether HF 14 solves the issue for you?
Comment by Immanuel N.
15458 | December 05, 2014 04:00:54 AM GMT
Will be closing the bug since all open IIS application pool crash-related issues were fixed in HF 14. Will reopen the bug if this issue still persists for the customer.
Comment by Immanuel N.
15459 | December 09, 2014 03:47:23 AM GMT
Thanks Immanuel, the update did fix the intermittent issues we were seeing. Thanks!
Comment by External U.
15460 | December 09, 2014 10:41:55 AM GMT