Zeroing in on UniData Problems
02 Sep 2004
What do you do when customers call you up and present you with odd problems occurring on their Rocket Software UniData systems? Where do you start trying to figure out what part of the system is misbehaving?
Your first question to your customer (of course) is: "What changed?"
The most common answer to that question is: "Nothing!"
Here are a few follow-on questions that you can use to stimulate your customer's memory and generate a more meaningful dialog:
- Did you upgrade your operating system or install a new operating system patch?
- Did you upgrade your UniData database release?
- Did you change any application programs or install a new release of your application software?
- Did you recompile or recatalog a program?
- Did you modify a dictionary item?
- Did you create or modify a TRIGGER on a file?
- Did you create or build a new alternate index on a file?
- Did you stop and restart the UniData daemons / services?
- This could result in activating a previously modified udtconfig file.
- Did you experience a system crash and reboot?
- Did your administrator 'clean up' files, symbolic links or logs at the operating system level?
- Did you install new hardware? Network card, hub, routers?
- Did your administrator tighten up permissions at the operating system level?
- Did you add more users to the system?
- Did you increase the system load by adding new reports or batch jobs?
- Did your administrator relocate UniData files to new file systems?
- Did you install any new third-party applications on your system?
- On Windows, this may have overwritten a common Windows dll.
- Did your Windows system become infected with a virus?
These are common problems, and the answers are fairly obvious. You may need to fix permissions, deal with a virus, and so on.
But what if the questions above don't provide any clues? It may be time to connect to your customer's system and check the UniData installation. It only takes a few minutes. Even if you don't see any anomalies, it is useful to make sure you have a good, clean installation as a solid base for further problem isolation. These items are described in detail over the next few pages of this article, but here is the list:
- Do you have the right UniData software for your operating system?
- Are all of the UniData daemons or services active?
- When did the daemons or services last start?
- Is the installation clean? Do you have a mixture of executables from two different UniData releases?
- Are you using the standard udt executable? Or do you have C routines linked in?
- Do you have concurrent versions of UniData active? If so, do you have environment variable or PATH problems?
- Are there any errors in the UniData error logs?
- Is your udtconfig file in good state? Does it match in-use values?
- Is it only a problem for one user?
- Is the problem only happening in one account?
Let's look at these questions one by one, and consider how your customer's answers to these questions can help you solve their problems.
Do you have the right UniData software for your operating system?
- First determine the version of the operating system you are running.
On most UNIX systems, uname -a will reveal the operating system release. Some exceptions:
SCO Open Server:
Windows: Right click My Computer, select Properties
- Then determine the operating system that the UniData version you are running was ported to.
In the UniData bin directory, there is an ASCII file named 'port.note'. Use cat, more, pg or whatever operating system command you are comfortable with to view it. For example:
# cat $UDTBIN/port.note
Platform : AIX 4.3.3
Operating System: AIX mustang 3 4 00027D2F4C00
Porting Date : Aug 27 2003
UniData Release: 6.0.8 60_030826_4193
Ported by : srcman
The UniData ECL command WHAT also displays this information. For example:
:WHAT
Hardware : IBM
Operating system: AIX
O.S. version : 4.3.3
UniData version: 6.0.8
Restore command: tar xvf /dev/...
Product Serial Number: 54321
The last line of the output of the UniData ECL command VERSION also displays the UniData release number. For this example, the string displayed is '608'.
If the operating system you are running matches what UniData was ported to, you are all set.
If the operating system is older, you have a problem. Rocket Software never guarantees backward compatibility to an older release of the operating system than what UniData was ported to.
If your operating system version is newer, you need to check the Product Availability Matrix on the Web to confirm that Rocket Software officially supports this combination (UniData release and your operating system release). This is part of the U2TechConnect Web site. Continuing with the UniData 6.0.8 example above, the matrix generated for AIX and UniData indicates that UniData release 6.0.8 is supported on AIX 4.3.3, AIX 5.1 and AIX 5.2.
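The port.note lookup above can be scripted when you check many systems. The following is a minimal sketch, assuming port.note keeps the "Label : value" layout shown in the example; port_note_field is a hypothetical helper name, not a UniData command.

```shell
# Print the value of one labelled line from a port.note style file.
# Assumes each line looks like "Label : value", as in the example above.
# port_note_field is a hypothetical helper, not a UniData command.
port_note_field() {
    # $1 = label to look for, $2 = path to the port.note file
    awk -v want="$1" '
        {
            pos = index($0, ":")
            if (pos == 0) next
            label = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", label)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (label == want) { print value; exit }
        }' "$2"
}
```

For instance, port_note_field "Platform" "$UDTBIN/port.note" prints the platform UniData was ported to, which you can then compare by eye with the output of uname -a.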
Are all of the UniData daemons or services active?
On UNIX systems, run $UDTBIN/showud. The screen shot in Figure 1 represents a standard installation, without the Recoverable File System (RFS) or Replication enabled.
Figure 2 shows the showud output with RFS and Replication enabled. You can see the additional daemons that are launched. Depending on how your system is configured, you may have additional aimglog, bimglog or udpud
For Windows, review the services from the control panel. For instance, for UniData 6.0.7 running on Windows 2000, click on Start -> Settings -> Control Panel -> Administrative Tools -> Services:
Figure 3 shows the expected UniData services, starting with UniRPC Service through UniData Terminal Server 6.0.
You can use these screen shots as a starting place, but you need to know what layered products your customer is running and how they are configured to be absolutely certain all of the proper daemons are running.
If some daemon or service is missing, you will need to restart it.
For UNIX systems, ask all users to exit, if they can. Then run $UDTBIN/stopud, followed by $UDTBIN/startud.
For Windows systems, you can try to restart the service from the control panel, but it may be cleanest to ask users to exit and reboot the system.
In order to determine why a daemon or service is missing, you should review the $UDTBIN error logs. For a system with RFS enabled, review the sm.log as well. On Windows, the udt.log may have some information as well. On UNIX systems, also check to see if there is a current 'core' file in the $UDTBIN directory. On very rare occasions, a daemon may crash and produce a core file.
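If you prefer a scripted check to reading a process listing by eye, the sketch below reports which expected daemons are absent. It is only an illustration: missing_daemons is a hypothetical helper, it assumes ps shows each daemon with its full path, and the list of daemons you pass in has to reflect how the customer's system is configured.

```shell
# Report which expected daemons do not appear in a process listing.
# Reads `ps -ef` style text on stdin; the daemon names to look for are
# passed as arguments. missing_daemons is a hypothetical helper.
missing_daemons() {
    listing=$(cat)
    for name in "$@"; do
        # A daemon counts as present when some field ends in "/<name>",
        # which assumes ps shows the full path to the executable.
        if ! printf '%s\n' "$listing" | awk -v n="/$name" '
            {
                for (i = 1; i <= NF; i++) {
                    start = length($i) - length(n) + 1
                    if (start >= 1 && substr($i, start) == n) found = 1
                }
            }
            END { exit !found }'; then
            printf '%s\n' "$name"
        fi
    done
}
```

A possible invocation is ps -ef | missing_daemons smm sbcs cleanupd, adding sm on a system with RFS enabled. Empty output means every named daemon was found.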
When did the daemons or services last start?
This is usually not a critical piece of information when looking at a UniData system, but it may give you a direction to pursue, particularly if UniData has been up for a very long time (maybe months), or a very short period of time (for example, if the customer has taken actions prior to contacting you).
There is an ASCII file in $UDTBIN named 'startud.log' that records this information.
Or, you can look at the date/time of any of the 'daemon.log' files that are created when the daemons startup. As 'smm' is always present, you can look at the 'smm.log' time/date.
On Windows systems, when the UniData services start, they create the 'daemon.log' files, but not the 'startud.log'. Review the date/time stamp of the 'smm.log' in the UniData bin directory with Windows Explorer.
If the database has been up for months, it might be reasonable to schedule some time to stop it and restart it. Although there are no known problems (such as memory leaks) that would make this necessary, it can't hurt.
If the database has just been restarted (today, yesterday), you might want to ask the customer if they already restarted UniData to try to fix their problem. They may have also rebooted the system (restarting the database). If they plead ignorance, review the 'startud.log' history in the $UDTBIN/saved_logs directory. The log in that directory will contain information about the 20 previous invocations of 'startud'.
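The "up for months" test can be automated with the standard find -mtime check against the smm.log time stamp. This is a rough sketch; log_older_than is a hypothetical helper and the threshold is whatever you consider a long time.

```shell
# Succeed when a file was last modified more than N days ago.
# log_older_than is a hypothetical helper built on the find -mtime test.
log_older_than() {
    # $1 = path to a log file such as smm.log, $2 = threshold in days
    [ -n "$(find "$1" -mtime +"$2" 2>/dev/null)" ]
}
```

For example, log_older_than "$UDTBIN/smm.log" 90 && echo "daemons started more than 90 days ago". A missing file simply makes the function fail, so it says nothing about a system where the log has been removed.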
Is the installation clean? Do you have a mixture of executables from two different UniData releases?
On UNIX systems, it is possible to upgrade UniData while UniData processes are still running. The outcome is unpredictable, depending on how different the UniData versions are and which executables are still running the prior version.
One error that we have seen when launching a udt process that indicates this upgrade error is:
SMM's shared memory not
How can this problem occur during an upgrade?
UniData is installed from a tar image. When using tar to extract files into a directory (such as $UDTBIN), most UNIX systems will not overlay an active executable. If you have not run stopud prior to upgrading, tar will not overlay the UniData daemons.
Another possibility is that you have an orphaned udt process still running -- even though the daemons are stopped. In this case, the udt executable in the $UDTBIN directory will be from the prior release.
UniData is shipped with the time/date stamp on the executables matching the date the product was ported to a platform. All of the dates for the daemons and the udt executable should match. We also provide debug builds of major executables (such as udt.d). These dates may be a day or two later than the normal, optimized executables, but are quite close. If you have put a debug executable in place, you will see this as a day or two later. This is normal.
Use the UNIX ls command to sort your $UDTBIN executables by date and look for a standard executable that is months or years older than the rest of the current release:
ls -lt $UDTBIN | pg
If you detect a problem, you need to reinstall UniData.
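As an alternative to scanning the ls output, the sketch below lists every file that is older than a reference executable such as smm. It is an illustration only: older_than_reference is a hypothetical helper, it relies on the -nt test (widely available but not required by POSIX), and it flags any older file, so the output still needs a human to judge whether the gap is days or years.

```shell
# List files in a directory whose modification time is older than a
# reference file in the same directory (for example smm).
# older_than_reference is a hypothetical helper; the -nt test is widely
# supported by sh implementations but is not required by POSIX.
older_than_reference() {
    # $1 = directory (for example "$UDTBIN"), $2 = reference file name
    ref="$1/$2"
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        if [ "$ref" -nt "$f" ]; then
            printf '%s\n' "${f##*/}"
        fi
    done
}
```

Running older_than_reference "$UDTBIN" smm prints the names of the files to look at more closely with ls -l.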
Are you using the standard udt executable? Or do you have C routines linked in?
On UNIX or Windows systems, you can link C routines into the udt executable that are run via the UniBasic CALLC function. C is a powerful language. Your C routines could change the behavior of standard UniData functions. When diagnosing a difficult problem, it is best to eliminate as many variables as possible. Rocket Software support may ask you to reproduce the problem you are reporting with a standard udt executable, to remove the possibility that your C routines have introduced the problem.
If you neglected to preserve the original udt executable, on UNIX systems we provide a udt.d version. This is compiled with debug flags and contains the entire symbol table, but still runs standard UniData code.
On Windows, if you did not preserve the original udt.exe prior to linking in C routines, you may need to reinstall UniData (or install on another system and move over the standard udt.exe for testing).
If you are unsure if you (or your customer) have linked in C routines, review the time/date stamp on the udt executable. All of the standard executables (such as smm, cleanupd, sbcs, sm, tm) should have the same time/date stamp. If you have linked in C routines, the time/date stamp for udt will be the date that you created your new udt executable. It will be a later date than the rest. You can always just check against the time/date stamp for smm.
# ls -l $UDTBIN/udt
-r-xr-xr-x 2 277 sys 6868428 Feb 21 2003 /data/ud60/bin/udt
# ls -l $UDTBIN/smm
-r-xr-xr-x 1 277 sys 255079 Feb 21 2003 /data/ud60/bin/smm
This output indicates that you are running a standard udt executable.
The size of a udt executable that has C routines linked in will generally be somewhat larger than the standard executable, as well.
If you are not running a standard udt, try reproducing the reported problem with a standard, unaltered udt. This is easy to do on UNIX, as we provide udt.d (as noted above). This is not always possible, as the C routines may get invoked by your application in the steps necessary to reproduce the problem.
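The time stamp comparison between udt and smm described in this section is easy to script. The following is a small sketch under the same assumption the text makes, namely that a relinked udt carries a later date than smm; udt_newer_than_smm is a hypothetical helper and -nt is not required by POSIX.

```shell
# Succeed when udt carries a later time stamp than smm in the same
# directory, which this section treats as a sign of a relinked udt.
# udt_newer_than_smm is a hypothetical helper; -nt is not required by POSIX.
udt_newer_than_smm() {
    # $1 = UniData bin directory, for example "$UDTBIN"
    [ -f "$1/udt" ] && [ -f "$1/smm" ] && [ "$1/udt" -nt "$1/smm" ]
}
```

For example, udt_newer_than_smm "$UDTBIN" && echo "udt is newer than smm". A debug executable copied into place would also trip this check, since debug builds may be dated a day or two later than the optimized ones.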
Do you have concurrent versions of UniData active? If so, do you have environment variable or PATH problems?
UniData supports concurrent installations of two or more major versions on the same system. This is useful for testing a new UniData release prior to upgrading your production system to a new version. However, concurrent installations can lead to the same kinds of problems you see when executables from two different versions of UniData are mixed in your UDTBIN directory. Typically, a user will set their UDTHOME and UDTBIN environment variables to point to the directories where one version of UniData is installed, but neglect to put that UDTBIN directory first in their PATH. The udt executable they launch is then from the UDTBIN of one version, but the daemons it is interacting with (based on the UDTBIN and UDTHOME environment variable settings) are from a different version.
How can you tell if there are multiple versions installed? On UNIX, look for multiple instances of the UniData shared memory manager daemon 'smm'. For example:
# ps -ef | grep smm
root 28141 1 0 08:45:03 pts/ta 0:00 /disk1/ud52/bin/smm -t 60
root 4132 1 0 Nov 12 ? 0:04 /disk1/ud60/bin/smm -t 60
root 28165 28092 1 08:45:19 pts/ta 0:00 grep smm
You can't rely on the customer naming his directories appropriately for the version running from that UDTBIN directory, but you can always look at the port.note file in the directory displayed to see exactly what version of UniData is running.
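To go from the ps listing straight to the directories whose port.note you should read, something like the following sketch may help. smm_bin_dirs is a hypothetical helper and it assumes ps shows smm with its full path, as in the listing above.

```shell
# Print each distinct directory that a running smm was launched from.
# Reads `ps -ef` style text on stdin and assumes full paths are shown.
# smm_bin_dirs is a hypothetical helper.
smm_bin_dirs() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /\/smm$/) {
                    dir = $i
                    sub(/\/smm$/, "", dir)
                    print dir
                }
            }
        }' | sort -u
}
```

With the listing shown above, ps -ef | smm_bin_dirs would print /disk1/ud52/bin and /disk1/ud60/bin. More than one line of output means more than one UniData installation is active.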
On Windows systems, review the UniData services from the control panel. Look for multiple instances of 'UniData Database Service n.n' - where 'n.n' could be 5.2 or 6.0 (for instance). In the example shown in Figure 5, there are three versions of UniData running.
Check your environment variables, including UDTBIN, UDTHOME and PATH. Make sure everything is synchronized. Be sure your UDTBIN is the first UniData bin directory in PATH.
On UNIX systems, from ECL, you can run: '!env | more' - checking each of the three variables.
On Windows systems, from ECL you can run: '!set UDTBIN', '!set UDTHOME', '!set PATH'
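On UNIX, the "UDTBIN first in PATH" rule can be checked mechanically. The sketch below walks a PATH-style list and prints the first directory that holds an executable file named udt; first_udt_dir is a hypothetical helper, and treating "contains an executable udt" as the definition of a UniData bin directory is an assumption made for this example.

```shell
# Print the first directory in a colon-separated list that contains an
# executable file named udt. first_udt_dir is a hypothetical helper.
first_udt_dir() (
    # $1 = colon-separated directory list, for example "$PATH"
    # The body runs in a subshell so the IFS change does not leak out.
    IFS=:
    for dir in $1; do
        if [ -f "$dir/udt" ] && [ -x "$dir/udt" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
)
```

A possible check is [ "$(first_udt_dir "$PATH")" = "$UDTBIN" ] || echo "PATH and UDTBIN disagree". Run it from a shell started inside the UniData session so that you see the same environment the udt process has.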
Are there any errors in the UniData error logs?
Errors encountered by UniData daemons/services or individual udt processes are recorded in error logs in the $UDTBIN directory. This is a great place to start when looking for problems on a UniData system. The pertinent error logs are:
- smm.errlog - Shared Memory Manager
Records authorization count down messages. Other than that, most of the errors here are during startud and smm initialization failure.
- sbcs.errlog - Shared Basic Code Server
Records requests to sbcs by udt processes for nonexistent objects.
- cleanupd.errlog - Cleanup daemon
Records information about the cleanup of locks and license information for udt processes that have abnormally terminated.
- rm.errlog - Replication Manager (UniData 6.0 and later)
- udt.errlog - errors encountered by individual udt processes.
Includes file corruption errors and shared memory errors. This will also include file open errors at 6.0.7 or optionally at 6.0.8 if the environment variable UDTERRLOG_LEVEL is set to 2 or higher.
For RFS systems, the first place to look is:
- sm.log - System Manager.
Note that even though this file is named '.log' rather than '.errlog', this is where any RFS system errors are logged. This includes individual 'tm' process errors or 'sm' daemon errors.
On Windows systems, there is an additional file to check:
- udt.log - This records telnet and socket errors.
There are two engineering-level diagnosis error logs that may be created on your system, as well:
- udtsort.errlog - records errors encountered by the udtsort
- udtlatch.log - records information about physical lock table internal structure access.
There is also a set of log files that record information about the operating system structures created when a UniData daemon launches (such as message queue, shared memory segments and semaphores). These are named:
The startud.log is useful in determining when the UniData daemons were last launched.
Review the logs and error logs in the UniData bin directory as described above.
If your customer has already stopped and restarted the database, these logs will have been refreshed. In order to review the error logs that were active when the problem occurred, you need to look in the saved_logs directory in $UDTBIN. The files in this directory contain the last 20 sets of logs -- with the most recent activity appended to the appropriate file and separated by a short line.
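A quick way to see which of these logs deserve attention is to list only the ones that are not empty. The following is a minimal sketch; nonempty_errlogs is a hypothetical helper, and it looks only at files ending in .errlog plus sm.log.

```shell
# List the error logs in a directory that actually contain something.
# Checks every *.errlog file plus sm.log, where RFS errors are recorded.
# nonempty_errlogs is a hypothetical helper.
nonempty_errlogs() {
    # $1 = directory to scan, for example "$UDTBIN"
    for f in "$1"/*.errlog "$1"/sm.log; do
        if [ -s "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Try nonempty_errlogs "$UDTBIN" first. Pointing it at the saved_logs directory should also work if the files there keep the same names, which is an assumption worth verifying on the customer's system.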
Is your udtconfig file in good state? Does it match in-use values?
On UNIX systems, your udtconfig file lives in the /usr/udnn/include directory, where 'nn' represents the major point release of the version of UniData you are running. For example, for UniData 6.0.x, this directory is /usr/ud60/include. On Windows systems, this resides in the UDTHOME\include directory. For a default 6.0 installation this would be: c:\ibm\ud60\include. This file can be updated by a number of tools, including UniAdmin, udtconf, shmconf and systest, but can also be manually modified with any text editor.
How can your udtconfig file get into a 'bad' state?
- Using the udtconfig file from a prior release when upgrading
From time to time, we add new parameters or restructure this file with a new release of UniData. If you upgrade your system, we advise you to let the upgrade process build a fresh, complete udtconfig file and then make individual tuning changes you need for your application to run properly with UniData. Please do not just copy your prior udtconfig, overlaying the new file created by the upgrade process. Your old copy may not have all of the parameters required to run the new version of UniData.
- Adjusting your UNIX kernel without adjusting your udtconfig
Some parameters need to be in sync with operating system kernel parameters. One primary one is that udtconfig SHM_MAX_SIZE needs to match the UNIX kernel shmmax setting. If you have tuned your kernel since installing UniData, you need to adjust udtconfig to match the new values.
How could your udtconfig file be out of sync with active in-use parameter settings?
When the UniData daemons start up, udtconfig parameters are loaded into shared memory. When a udt process starts, it derives most of its udtconfig parameters from this information stored in shared memory, rather than from the current udtconfig file. There are a few parameters that can be modified by setting environment variables prior to launching the udt session, but these are generally exceptions (for example TMP). If you have modified udtconfig, but not stopped and restarted the daemons, the active settings may not match the 'on disk' udtconfig file. When diagnosing a problem, it is imperative to know what parameters are active.
- Check to see if your udtconfig file is in a 'good' state. On UNIX systems, use the shell-level systest command to generate a default udtconfig file in a tmp directory. Then compare this to your active udtconfig. Review the differences and confirm that the differences are explainable as custom settings for your site.
$UDTBIN/systest -f newconfig
('newconfig' can be any name you want; this file is a default udtconfig file)
diff newconfig /usr/ud60/include/udtconfig
(use the UNIX diff command to compare the default 'newconfig' and your existing udtconfig)
- How can you tell if your in-use values match your on-disk udtconfig settings? On UNIX systems, use $UDTBIN/showconf to display in-use values. Compare them to your 'on disk' values.
$UDTBIN/showconf > liveconfig
('liveconfig' can be any name you choose)
diff liveconfig /usr/ud60/include/udtconfig
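Raw diff output can be noisy when the two files differ in comments or ordering. The sketch below compares parameters by name instead. It assumes both files consist of NAME=value lines with # comments, which you should confirm against the actual files; changed_params is a hypothetical helper, and the parameter names in the usage note are only illustrative.

```shell
# Print parameters whose value in the second file differs from, or is
# absent in, the first file. Assumes NAME=value lines and # comments.
# changed_params is a hypothetical helper.
changed_params() {
    # $1 = baseline file (for example newconfig), $2 = file to compare
    awk '
        /^[ \t]*#/ { next }            # skip comment lines
        index($0, "=") == 0 { next }   # skip lines without NAME=value
        {
            eq = index($0, "=")
            name = substr($0, 1, eq - 1)
            value = substr($0, eq + 1)
            gsub(/[ \t]/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
        }
        FILENAME == ARGV[1] { base[name] = value; next }
        !(name in base) { print name ": (not in baseline) -> " value; next }
        base[name] != value { print name ": " base[name] " -> " value }
    ' "$1" "$2"
}
```

For example, changed_params newconfig /usr/ud60/include/udtconfig lists each tuned parameter on one line, such as a changed SHM_MAX_SIZE. Parameters that exist only in the baseline are not reported, so run it in both directions if you also need to spot parameters missing from the second file.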
Is it only a problem for one user?
Sometimes a problem is isolated to a particular user. Only that user experiences the problem. Other users or root/administrator users can run the same application with no errors. There are two primary areas to review to determine what is different about the problem user: file access permissions and environment variable settings.
File Access Permissions:
UniData files are all standard UNIX or Windows file system files. We rely on the file access permission schemes provided by the operating system to control reads and writes to UniData files. If the problem user is experiencing read or write errors, review how that user is set up at the operating system level with regard to group membership. Compare this to another user who is not having the problem. Examine the access permissions set on the files the problem process uses.
If the problem process uses SQL access to the database, also review the access for this user established in the PRIVILEGE file in the UniData account where the problem occurs.
Environment Variable Settings:
There are a number of environment variables that could be set differently for the problem user.
- Does PATH have the correct UDTBIN directory first in the list of directories?
- If UDTHOME is set differently, the problem user may be running a different version of a globally cataloged program.
- If TMP is pointing to a different directory, is that directory full? Are permissions OK on that directory?
Other things to check:
- Are UNIX stty settings different for the problem user?
- Is the problem user running with a different terminal type (for example wyse50 rather than vt100) and having display problems? The problem may be with the telnet client for that user.
File Access Permissions:
Use the user maintenance utilities provided by your operating system to determine how file access is set up for the problem user. Compare that setup to a functioning user. Check the access permissions set on the files the user is accessing. One easy way to determine if permissions are likely to be the problem is to run the same application logged in as 'root' or 'administrator'.
Environment Variable Settings:
From ECL in a UniData session, display the environment variables for the problem user and compare to a functioning user. Running this from within an active UniData session is important. If you just display settings from the shell (prior to launching a UniData session), you won't see the complete picture.
On UNIX systems, user-specific environment variables are generally set in a .login or .profile file in that user's home directory. This is often where stty settings are established as well.
On Windows systems: Right click on My Computer and select Properties -> Advanced -> Environment Variables.
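On UNIX, once you have captured the environment of the problem user and of a functioning user (for instance by saving the env output from each user's UniData session to a file), the comparison can be narrowed to the variables discussed in this section. This is a sketch only; compare_env is a hypothetical helper and the variable list is taken from the items above.

```shell
# Compare selected environment variables between two saved env listings.
# Each input file is expected to hold NAME=value lines as printed by env.
# compare_env is a hypothetical helper.
compare_env() {
    # $1 = listing from a functioning user, $2 = listing from the problem user
    for var in UDTBIN UDTHOME PATH TMP; do
        a=$(sed -n "s/^$var=//p" "$1" | head -n 1)
        b=$(sed -n "s/^$var=//p" "$2" | head -n 1)
        if [ "$a" != "$b" ]; then
            printf '%s differs: %s | %s\n' "$var" "$a" "$b"
        fi
    done
}
```

The output has one line per variable that differs, with the functioning user's value on the left. A variable that is unset for one user shows up as an empty value on that side.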
Is the problem only happening in one account?
UniData applications are often set up with multiple accounts where UniData sessions are launched. This is done to group files or applications together (for example payroll or inventory) or to control user access to specific applications or subsets. Sometimes software development or demonstration accounts are established as well. Occasionally a problem will be reported as only happening when any user launches a UniData session in a specific account. The same application program runs fine when launched from a different account.
The most unique item in any UniData account is the VOC or 'vocabulary' file. You need to compare entries in the VOC of the problem account to those in a functioning account. Here are some specific items to consider:
- The LOGIN paragraph. If there is an item in the VOC named LOGIN, that item is executed first when a user logs in. Review this for local settings. Items often configured in a LOGIN paragraph include:
- UDT.OPTIONS are a series of switches that determine how a variety of UniBasic or ECL commands function.
- BASICTYPE determines how programs compiled in this account run - typically 'P'ick mode or 'U'nidata mode.
- ECLTYPE determines how UniQuery commands are parsed. Either U or P.
- SETPTR determines initial printer assignments.
- File pointers to program files or paragraph/proc files. Make sure that there aren't local copies of common programs, procs or paragraphs that behave differently on this problem account.
- Program catalog entries. If a UniBasic program is cataloged DIRECT or LOCAL, there is a catalog entry in the VOC of that program name that points to the object code to execute. Make sure common programs are pointing to the same place as on the 'good' account. If a program is cataloged globally, there will not be any entry at all in the VOC. Which program gets run is based on the UDTHOME environment variable setting.
We've looked at a number of common problems and solutions that you might encounter when administering UniData databases. If you'll begin your problem determination process by asking the questions we've considered in this article, you should be well on your way to solving your users' most common problems.
After nine years of developing, deploying and supporting application software for small manufacturing companies, in 1991 Wally Terhune joined the technical support team of a fledgling database company named Unidata. Since that time he has worked exclusively in the technical support group, wearing a variety of hats and in various senior positions including that of manager and director. Wally’s current role is that of U2 Support Architect. He works closely with customers and the U2 development team to get to the root of problems associated with running U2 (UniData and UniVerse) products. The UniData database continues to be his primary focus and passion.