The strange case of a failing adstrtal.sh script…

I ran into an interesting problem on an E-Business Suite R12 test system recently: the adstrtal.sh script wasn't working. It reported success upon exit, but it didn't actually start anything. The adstrtal.log file did not show any errors, but it also did not show the expected calls to the Apps listener, OPMN, and Concurrent manager control scripts. When I ran the individual control scripts (adalnctl.sh, adopmnctl.sh, adcmctl.sh), however, the respective services started without issue.

So what the heck was going on?

Here's what else I knew about this situation:

  • Production did not exhibit this problem
  • One other test system did not exhibit this problem
  • Both test systems were clones of production
  • All three systems, not surprisingly, were at slightly different patch levels
  • The non-working system was a more recent clone, and as such had a patch level closer to that of production.
  • The adstpall.sh script was also broken on the non-working system, and functioning properly on the others.

For brevity's sake, I'll refer to the non-working system as TEST1, and the working test system as TEST2. I'll also assume that most people know what PROD means. ;-)

When I looked at adstrtal.log on TEST1, what I saw at the end of the file was:

[Service Control Execution Report]
The report format is:
  <Service Group>  <Service>  <Script>         <Status>

  Other Services                     Disabled

ServiceControl is exiting with status 0

Considering that this is a single-node system, I was expecting to see a bit more than that. :-) The same log file on TEST2 read (lightly edited to obfuscate the instance name and host):

[Service Control Execution Report]
The report format is:
  <Service Group>            <Service>                                   <Script>         <Status>

  Root Service                                                                            Enabled
  Root Service               Oracle Process Manager for CONTEXT_NAME  adopmnctl.sh     Started
  Web Entry Point Services                                                                Enabled
  Web Entry Point Services   Oracle HTTP Server CONTEXT_NAME          adapcctl.sh      Started
  Web Entry Point Services   OracleTNSListenerAPPS_CONTEXT_NAME       adalnctl.sh      Started
  Web Application Services                                                                Enabled
  Web Application Services   OACORE OC4J Instance CONTEXT_NAME        adoacorectl.sh   Failed
  Web Application Services   FORMS OC4J Instance CONTEXT_NAME         adformsctl.sh    Started
  Web Application Services   OAFM OC4J Instance CONTEXT_NAME          adoafmctl.sh     Started
  Batch Processing Services                                                               Enabled
  Batch Processing Services  OracleConcMgrCONTEXT_NAME                adcmctl.sh       Started
  Batch Processing Services  Oracle Fulfillment Server CONTEXT_NAME   jtffmctl.sh      Started
  Other Services                                                                          Enabled
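Since the exit status alone can't be trusted here, one way to catch this kind of silent no-op is to look at the report itself. The following is only a minimal sketch: the function name check_report is made up for illustration, and it assumes the only statuses worth counting are the "Started" and "Failed" values seen in the logs above (other status strings may exist that I haven't accounted for).

```shell
#!/bin/sh
# Sketch: count the services that the last Service Control Execution Report
# in a log lists as Started or Failed. Prints the count, and returns
# non-zero when the report shows that no service script was attempted.
# ASSUMPTION: status is the last field on the line, and only "Started" and
# "Failed" (the values seen in the logs above) mark an attempted service.
check_report() {
    awk '
        /\[Service Control Execution Report\]/ { in_report = 1; count = 0; next }
        in_report && /ServiceControl is exiting/ { in_report = 0 }
        in_report && $NF ~ /^(Started|Failed)$/  { count++ }
        END { print count + 0; exit (count > 0 ? 0 : 1) }
    ' "$1"
}
```

Called as check_report adstrtal.log, it would print 0 and return a failure status for a report like the one from TEST1, where only "Other Services ... Disabled" appears.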

Since adstrtal.sh reads the context file to determine which service control scripts to run, my first thought was that the context file on TEST1 had become corrupted. Recreating the context file with AutoConfig, however, did not fix the problem. The next things to verify were the AD and AutoConfig patch levels. Unfortunately, the results were inconclusive: AD was the same on all three systems, and TEST2 was missing a techstack patch (6145693) that was installed on TEST1 and PROD. Before going to the trouble of further crawling through the AD_BUGS table or doing three-way comparisons of patchsets.sh output, I decided to take a look at the versions of the files involved, to see if I could find something to help me target my search. What I found was interesting, if not particularly enlightening:

  Version of                    PROD                  TEST1                 TEST2

  adautocfg.sh                  120.2                 120.2                 120.2
  adstrtal.sh                   120.13.12000000.3     120.13.12000000.3     120.13.12000000.3
  Context file (adxmlctx.tmp)   120.217.12000000.43   120.217.12000000.48   120.217.12000000.48

Aha! Everyone has the same version of the adautocfg.sh and adstrtal.sh scripts, but the context file versions are different between production and the test systems! Oh, wait, except adstrtal.sh works on PROD and TEST2, and not on TEST1. Drat.
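For anyone who wants to repeat this kind of comparison, here is a rough sketch of pulling a version string out of a file. It assumes the file carries a header of the form "$Header: <filename> <version> <date> ... $", which is my assumption about the layout rather than something shown above, and the function name header_version is made up. It strips non-printable bytes with tr so that it also works on binary files such as .class files; the strings utility is the more common tool for that job. Files that record their version some other way won't match.

```shell
#!/bin/sh
# Sketch: print the version recorded in a file's "$Header: name version ..."
# string, or nothing if no such header is found.
# ASSUMPTION: the version is the second token after "$Header:" and consists
# of digits and dots only.
header_version() {
    # Drop non-printable bytes first so binary files are safe to feed to sed.
    LC_ALL=C tr -cd '\11\12\40-\176' < "$1" |
        sed -n 's/.*\$Header: *[^ ]* *\([0-9][0-9.]*\).*/\1/p' |
        head -n 1
}
```

Running it against the same file on each system (for example header_version adstrtal.sh) and lining up the output gives a table like the one above.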

One last gasp before the brute-force patch comparison approach. The section of adstrtal.sh that launches the individual service control scripts uses oracle.apps.ad.autoconfig.ServiceControl to parse the context file. Checking the version of $JAVA_TOP/oracle/apps/ad/autoconfig/ServiceControl.class and adding it to the above table shows:

  Version of                    PROD                  TEST1                 TEST2

  adautocfg.sh                  120.2                 120.2                 120.2
  adstrtal.sh                   120.13.12000000.3     120.13.12000000.3     120.13.12000000.3
  Context file (adxmlctx.tmp)   120.217.12000000.43   120.217.12000000.48   120.217.12000000.48
  ServiceControl.class          120.3.12000000.2      120.3.12000000.2      120.3.12000000.4

And there's the real "aha!" moment. It appears that version 120.3.12000000.2 of ServiceControl.class does not parse a version 120.217.12000000.48 context file correctly. Sure enough, when I copied the ServiceControl.class file from TEST2 to TEST1 (not generally recommended, but these are test systems, and it was a weekend), the adstrtal.sh and adstpall.sh scripts on TEST1 worked exactly as expected.

Wrap-up

I did a little bit more digging, and here's what I found before I stopped: It looks like the culprit was one of the patches applied in support of the Applications Management Pack for E-Business Suite (yeah, you thought I was done talking about Grid Control? Me too...) Both Patch 6776948 and Patch 6874927 make changes to a number of files related to cloning and clone context file creation. This likely explains the difference in context file versions between PROD and TEST1.

And why did I stop digging? Simple. Both of these patches deliver files that are superseded by the latest R12 techstack patches, 7237313 and 7237006. While I could probably make a case that I'd found a bug, and submit a small pile of supporting evidence to Oracle Support, the fact that I know both a workaround (use the individual component scripts) and a solution (apply the latest techstack patches) makes the likelihood of getting a bug fix pretty remote. In the end, it's probably a better use of my time, and Support's, to just apply the latest techstack patches to make my problem go away.
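In the meantime, the workaround of calling the individual control scripts could be wrapped up along these lines. Treat this strictly as a sketch: the script order is simply the order in the TEST2 report, and the rest is assumption on my part, namely that the scripts live in $ADMIN_SCRIPTS_HOME, that each accepts "start" as its first argument, and that adcmctl.sh also wants the apps credentials (passed here through a hypothetical APPS_CRED variable). It defaults to a dry run that only prints what it would do.

```shell
#!/bin/sh
# Sketch of the workaround: call each control script directly, in the order
# the TEST2 report lists them. Defaults to a dry run that only prints.
# ASSUMPTIONS (not from the post): scripts live in $ADMIN_SCRIPTS_HOME, each
# takes "start" as its first argument, and adcmctl.sh also takes the apps
# credentials, supplied via the hypothetical APPS_CRED variable.

SCRIPTS="adopmnctl.sh adapcctl.sh adalnctl.sh adoacorectl.sh
adformsctl.sh adoafmctl.sh adcmctl.sh jtffmctl.sh"

start_services() {
    dir=${ADMIN_SCRIPTS_HOME:-.}
    failed=0
    for s in $SCRIPTS; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would run: $dir/$s start"
            continue
        fi
        if [ "$s" = "adcmctl.sh" ]; then
            "$dir/$s" start "$APPS_CRED" || failed=1
        else
            "$dir/$s" start || failed=1
        fi
    done
    # Non-zero if any script reported a failure.
    return $failed
}
```

Calling start_services with DRY_RUN unset just lists the eight commands; setting DRY_RUN=0 would actually run them, so check the printed list against your own environment first.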

References

For checking up on your own system's patch levels:
Metalink: How to determine if you are on the latest Autoconfig related patches.
Metalink: Oracle Applications Current Patchset Comparison Utility - patchsets.sh

Discussions on the value of keeping patches up to date:
The inimitable Steven Chan at Oracle: Top 5 Myths About Patching Apps Environments
James Morrow at the Triora group: How Often Should I Apply Patches?

4 Comments

  1. Vladimir
    Posted 16 July 2009 at 10:17 | Permalink

    Excelent job!!!

  2. SundarK
    Posted 2 April 2010 at 3:08 | Permalink

    Nice Work!!!

  3. bhushan
    Posted 21 December 2011 at 10:15 | Permalink

    nice job on NOT following up on it!!!

  4. Pankaj
    Posted 27 December 2012 at 0:40 | Permalink

    Hi,
    I am facing the same problem that adstrtal.sh is not bringing up concurrent manager. when i go with adcmctl.sh it starts with no further issue.
    I checked the version of all files you mentioned in this forum.
    all are same with outher instance which is not having this problem.
    Please help!!
