Hung threads - LTM side - unable to access employee space

I know this may be hard to comprehend but wanted to try.

We went live in May 16 with the LTM and S3 payroll apps (and a few others as well).

Employees and managers go to an HTML5 page/URL on the LTM side.

We are federated, so that page then redirects automatically to the S3 URL for authentication.

Once in July and then CONSISTENTLY starting in November, we have had several downtimes just on the employee/manager space and the external job board, all of which use the webserver on the LTM side.

We are working diligently with support/AMS but not making much headway. All we know is that hung threads seem to be the issue, with no way of knowing what is hanging or why.
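One way to at least see when threads hang and in which pool is to scan the WebSphere log for the thread monitor warning. This is only a sketch, assuming the server writes the standard WSVR0605W "may be hung" message to its SystemOut.log; the function names and the sample line in the comment are made up for illustration.

```python
import re
from collections import Counter

# Shape of the warning this looks for (illustrative sample, not from a real log):
# [11/20/17 20:31:04:859 EST] 00000025 ThreadMonitor W   WSVR0605W: Thread
# "WebContainer : 3" (00000031) has been active for 647279 milliseconds and may be hung.
HUNG = re.compile(
    r'^\[(?P<stamp>[^\]]+)\].*WSVR0605W: Thread "(?P<thread>[^"]+)".*?'
    r'active for (?P<ms>\d+) milliseconds'
)


def hung_thread_events(lines):
    """Return (timestamp, thread name, milliseconds active) for each warning line."""
    events = []
    for line in lines:
        match = HUNG.search(line)
        if match:
            events.append(
                (match.group("stamp"), match.group("thread"), int(match.group("ms")))
            )
    return events


def count_by_pool(events):
    """Count warnings per thread pool, e.g. 'WebContainer : 3' counts as 'WebContainer'."""
    return Counter(thread.split(" : ")[0] for _, thread, _ in events)
```

Feeding it an open log file (for example `hung_thread_events(open(path))`, where `path` is wherever the server's SystemOut.log lives) gives a list that can be lined up against the outage times.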

We recently added 20 GB and adjusted the heap on LmrkListBatch. Since then, we made it almost 2 weeks without issue, but then it went down on both Sunday and Monday for about 20 minutes each time and resolved on its own.

90% of the time, it resolves on its own: 20 minutes, then it's fine. Sometimes it's over an hour (we can see this after the fact because we have a 3rd party run webchecks on our URL and report to us if it fails to launch successfully).
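For anyone without a 3rd party service, the same kind of webcheck can be roughed out in a few lines. This is a minimal sketch using only the Python standard library, not what the monitoring vendor actually runs; the function names, the timeout, and the sampling interval are assumptions.

```python
import time
import urllib.error
import urllib.request


def check_url(url, timeout=10.0):
    """Do one GET and return (ok, detail). A hang shows up as a timeout error."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            elapsed = time.monotonic() - start
            return 200 <= resp.status < 400, f"HTTP {resp.status} in {elapsed:.1f}s"
    except urllib.error.HTTPError as exc:
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Timeouts, refused connections, DNS failures
        return False, f"error: {exc}"


def outage_windows(samples):
    """Collapse (timestamp, ok) samples into (start, end) outage windows.

    end is None when the last sample was still failing.
    """
    windows = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            windows.append((start, ts))
            start = None
    if start is not None:
        windows.append((start, None))
    return windows
```

Calling `check_url` on the employee space URL once a minute and passing the results to `outage_windows` gives the start and length of each downtime, which is what makes the "20 minutes, then it's fine" pattern visible after the fact.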

I have gone through everything I can find to see whether anything performed on our systems lines up with when this consistently started in November.

The only 2 things I can find are the application of Microsoft server patches at the end of October and the implementation/use of Proxy roles to allow Coordinators to do things on behalf of Managers. In late November we updated the environments and application levels. We are on v10 with Windows servers, a SQL database, and IBM WebSphere. AMS has added RAM/adjusted heap twice. This consistently occurs at night but also occurs at random during the day.

I'm not sure that anyone can really help here, since we have been working with Support, AMS and Development for months, but I thought it was worth a try to see if anyone has any insight.

Thanks in advance for any feedback.

Comments

Legacy Contributor

We are on-premises and have had the same problem as you. We did all the things you did to try and fix the problem. We ended up scheduling a reboot of both our Landmark and LSF servers every two weeks. That has resolved the problem for the most part, but every once in a while we still get hung threads, especially when managers are doing performance evaluations. Wish I could offer a better solution. Good luck.

Legacy Contributor

The first question I was going to ask was: what is your reboot schedule? It looks like JimY has already begun to answer that question. Since our new 10x configuration is very similar to yours, may I share our experience: we ran our 10x system from January until June in "test" mode, meaning it was running parallel to our old 9.0.1 system. That gave us some long-term experience. We noticed strangeness and problems with the system, especially security, after about 10 days of continuous running.

We insisted that our migration consulting team write us updated stop/start scripts. Then I set up a system Stop - Wait - Start sequence once a week.

We have not experienced trouble since.

Legacy Contributor

Hi Annette,

We are on-premises with UNIX/AIX, and we see this situation regularly. We have found no solution other than a WebSphere restart. In our case, even that won't resolve the problem and we will have to do a full system restart to resolve it.

We're on LMRK 10.1.1.19 and LSF 9.0.1 and we are federated with LMRK being primary.

I wish we had a better answer.

Kelly

Legacy Contributor

By the way, we do a cold backup weekly, so the entire system is restarted. This is fine as long as the restart is done in the correct order (i.e. LMRK must be up and running before LSF is started). We are under Infor Managed Services, and unfortunately sometimes it doesn't happen this way.

Legacy Contributor

Thanks everyone!

No one has ever recommended weekly/biweekly restarts of the environments or of WebSphere itself to us.

When this was first happening, support kept restarting the entire environment, but then I asked whether just restarting WebSphere would work, since this was just web-related 'stuff' and that's much quicker. So they started doing that to resolve it. Sometimes that didn't work, though, and they had to do a full restart.

Then, after setting up our 3rd party monitoring, we saw this was occurring almost every night between 8 and 9pm but resolving on its own. Then I found that a scheduled maintenance job in async (manager org structure things) that normally runs for a minute or so was running long, and another one I had scheduled to start at 8:20, so both were running at the same time and maxing out the ListBatch thing.

So I moved the 8:20 job to a 9pm start, and that didn't seem to be an issue anymore, but we still have 'lock ups'.
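The collision described here (a job that ran long overlapping the one scheduled at 8:20) is the kind of thing that can be checked from job history. A small sketch, assuming the history can be pulled out as (name, start, end) rows; the function name and that row layout are assumptions, not anything the async scheduler provides.

```python
from datetime import datetime
from itertools import combinations


def overlapping_runs(runs):
    """runs: list of (name, start, end). Return name pairs whose run windows overlap."""
    pairs = []
    for (a, a_start, a_end), (b, b_start, b_end) in combinations(runs, 2):
        # Two windows overlap when each one starts before the other ends
        if a_start < b_end and b_start < a_end:
            pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    # Hypothetical job names and run times, for illustration only
    night = [
        ("OrgStructureJob", datetime(2017, 11, 20, 20, 0), datetime(2017, 11, 20, 20, 35)),
        ("SecondJob", datetime(2017, 11, 20, 20, 20), datetime(2017, 11, 20, 20, 25)),
    ]
    print(overlapping_runs(night))
```

Run against a few weeks of history, this would show whether any other scheduled jobs still pile up on each other after the move to 9pm.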

We are also on-premises, but have AMS to help with our support through our implementation of phase 2 for the supply chain side.

I greatly appreciate this group. Everyone has been very helpful when I reach out.

Thanks again!

lrjurgen

We are on HP-UX for our LSF environment and of course Windows for Landmark. We had been seeing hung threads back on version 9, years ago, so we implemented nightly restarts, which seem to have cleared this up for the most part, and I am told that WebSphere was the culprit.

We stop, back up and restart EVERY night right around midnight for LSF, WebSphere and Landmark. We have scripts on the servers, but basically:
1. Landmark goes down, WebSphere goes down, LSF goes down
2. Backups are run
3. LSF comes up, WebSphere comes up and then Landmark comes up
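The three steps above can be sketched as a small driver. This is not our actual script, just an outline: the `stop`, `start` and `backup` callables stand in for whatever site-specific scripts do the real work, and the order lists are parameters because the right order depends on how the site is set up.

```python
import time

# Order taken from the steps above; other sites may need a different order
STOP_ORDER = ["Landmark", "WebSphere", "LSF"]
START_ORDER = ["LSF", "WebSphere", "Landmark"]


def nightly_cycle(stop, start, backup,
                  stop_order=STOP_ORDER, start_order=START_ORDER, wait_seconds=0):
    """Stop every service in order, run backups, wait, then start every service in order.

    stop and start take a service name and return True on success.
    The cycle aborts on the first failure so nothing starts out of order.
    """
    for name in stop_order:
        if not stop(name):
            raise RuntimeError(f"stop failed for {name}")
    backup()
    time.sleep(wait_seconds)
    for name in start_order:
        if not start(name):
            raise RuntimeError(f"start failed for {name}; later services not started")
```

Aborting on the first failed start is a deliberate choice here: it avoids bringing a later service up before the one it depends on is actually running.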

Legacy Contributor

We are on AIX for LSF and Landmark, with LSF being the primary authenticator. We are experiencing hung threads several times a week. We currently have a weekly restart of the entire Landmark system. LSF is recycled nightly. When we have hung threads and all web applications are hosed, we recycle WebSphere and sometimes have to kill WebSphere processes that are left hung after the stop finishes, or that are preventing the stop from completing. Then we get a clean startup of WebSphere and all is good again.

Can I ask what 3rd party monitoring you are using?

Legacy Contributor

Any chance you guys have any of these Maintenance jobs on a schedule? I feel they may have some impact, but I'm not sure.

I put them on a schedule so HR didn't have to manually run them every day. These went on a schedule at the end of October, and 3 weeks later we started having issues. No clue if it's related, but thought I'd ask.

CreateEmployeeSubordinateRecords

UpdateKeywords

BuildOrgUnitWorkAssignmentSet

RebuildActorOrgUnit

GenerateSupervisorChart

BuildSupervisorEmployeeSet
