Tracker issue: CF-4200009

Title:

CF should track the state of last execution of tasks in a file separate from neo-cron.xml, to prevent loss of all tasks on its corruption


Status/Resolution/Reason: Closed/Withdrawn/NotABug

Reporter/Name(from Bugbase): Charlie Arehart / Charlie Arehart ()

Created: 10/12/2017

Components: Scheduler

Versions: 2016

Failure Type: Others

Found In Build/Fixed In Build: /

Priority/Frequency: Normal /

Locale/System: /

Vote Count: 8

As many reading this will know, there can be cases where CF crashes and users lose their scheduled tasks, because the neo-cron.xml file becomes corrupted. And it seems that this happens because the file is being written to at the time of the crash. 

And the odds of such a write during a crash rise because, right now, the file holds both the DEFINITION of each task and the STATE OF LAST EXECUTION of each task.

It would seem that Adobe could very easily alleviate this problem by having a separate file to track each kind of data. It's like good normalization of a database. :-)
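
To illustrate the separation requested above, here is a minimal sketch in Python. It is not how ColdFusion stores tasks: the class name `TaskStore`, the file names `task-definitions.json` and `task-state.json`, and the use of JSON instead of CF's own XML format are all assumptions made for illustration. The point it shows is that recording an execution rewrites only the state file, so the definitions file stays untouched between edits.

```python
import json
from pathlib import Path


class TaskStore:
    """Keeps task definitions and last-execution state in two separate files.

    File names and JSON layout are hypothetical, chosen only for this sketch.
    """

    def __init__(self, directory):
        self.definitions_path = Path(directory) / "task-definitions.json"
        self.state_path = Path(directory) / "task-state.json"

    @staticmethod
    def _load(path):
        # A missing file is treated as "no data yet".
        return json.loads(path.read_text()) if path.exists() else {}

    def save_definition(self, name, definition):
        # Written only when a task is created or modified.
        definitions = self._load(self.definitions_path)
        definitions[name] = definition
        self.definitions_path.write_text(json.dumps(definitions, indent=2))

    def record_execution(self, name, fired_at):
        # Written on every run; never touches the definitions file.
        state = self._load(self.state_path)
        state[name] = {"lastfire": fired_at}
        self.state_path.write_text(json.dumps(state, indent=2))

    def last_fire(self, name):
        # If the state file is unreadable, the task still exists;
        # only its execution history is unknown.
        try:
            return self._load(self.state_path).get(name, {}).get("lastfire")
        except json.JSONDecodeError:
            return None
```

With this layout, a crash in the middle of `record_execution` could at worst damage the state file. The definitions would survive, and the lost run times could be recovered from scheduled task logging if that is enabled.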

Attachments:

Comments:

We have made some core changes which prevent any NEO file from getting corrupted. Basically, it checks for available resources before reading/writing these files. Hence, this feature is not required.
Comment by Suchika S.
27538 | April 16, 2018 12:07:12 PM GMT
I am running CF2016 (2016.0.03.300466) and have had this happen to me twice in the last two days. We had to force our server to reboot twice and each time, we were left with a blank neo-cron.xml file and had to restore it from backup. So, it's still happening in that scenario.
Comment by James S.
29826 | October 23, 2018 04:46:33 PM GMT
James, you show running only CF2016 update 3 there. The latest is update 7. Your point would be stronger if you could get that update in place and then report if it still happens. Even so, Suchika, I'm bummed to hear your reply. Regardless of what "checks" you may do, there's no logical reason (to an outsider) that the neo-cron.xml should hold BOTH the definition AND the tracking of EXECUTION of each task, other than that "that's how it's always been done". This seems a classic case of a need to "normalize" what is essentially a database, where you have two different kinds of things tracked in one "table" (the neo-cron.xml). Why not create a new second one for tracking EXECUTIONS? That would further diminish the risk of corruption, since then the one holding the definitions would be FAR less likely to be written to. Sure, there would still be risk of corruption to the new file that would track executions, but I'm sure most would far prefer to risk that (if your fix proved not to be as reliable as you hope), since that info on executions could at least be discovered if scheduled task logging is enabled. Would you please remove the withdrawn state until you can answer this response? (Somehow I never got any notification of your comment when you posted it in April. I did see the notification of James's new comment today, which brought me back here.)
Comment by Charlie A.
29827 | October 23, 2018 05:01:54 PM GMT
I completely agree with Charlie on this. We are running CF2016 with update 7 and just had this happen so I'm not sure what Suchika's comments are based upon. Makes no sense to keep updating this file for every execution. There's bound to be either contention or, worse yet, a fatal error (like a server crash) in the middle of a file write. Putting this in the database would be much more logical.
Comment by Mike P.
30093 | January 07, 2019 05:55:09 PM GMT
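One common way to reduce the risk of a crash in the middle of a file write, as raised in the comment above, is to write the new content to a temporary file and then rename it over the original. The sketch below shows that general technique in Python. It is not a description of what ColdFusion does internally, and the helper name `atomic_write_text` is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path, text):
    """Replace `path` with `text` so readers see either the old or the new file."""
    path = Path(path)
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        # os.replace swaps the file in a single call (atomic on POSIX).
        os.replace(tmp_name, path)
    except BaseException:
        # Leave the original file untouched and remove the partial temp file.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

If the process dies before `os.replace` runs, the original file is still intact and only a stray `.tmp` file is left behind. This lowers the chance of ending up with a blank file, but it does not remove the case for keeping definitions and execution state apart.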
Suchika, or someone at Adobe, PLEASE respond to this. It should NOT have been "withdrawn". The problem remains--indeed, it's taken on new importance in light of the recent Feb 2019 CF updates (to 2018, 2016, and 11) which in some cases messed up people's neo-cron.xml file, so the single-generation bak files (as I describe above) are yet another reason people are suffering due to that update bug: they rarely HAVE a good bak file to recover from. Suchika said above in Apr 2018 that changes had been made that "checks for available resources before reading/writing these files". That's good to hear, but it doesn't change the fact that because the task mechanism OVERWRITES the single-generation bak, folks quickly lose any good backup of the tasks. Bottom line: info about the DEFINITION of the tasks should be in ONE file, and info about the EXECUTION of tasks should be in another. That would at least ensure people have at least ONE good backup, until they (or CF itself) changes the tasks. Better still would be for CF to keep multiple generations of this file--indeed of all the neo*.xml files. It would just be a useful fail-safe, and they are tiny files in nearly all cases! Save 10 of them by default, and offer an admin option to control that if you want. But we can't keep going on with this single-generation backup of the neo-cron.xml file, nor with it being overwritten on ANY execution of a task!
Comment by Charlie A.
30368 | February 21, 2019 04:31:11 PM GMT
Why not keep the last 10 versions of the neo- files or have a setting in CF Administrator to allow someone to set the number of versions they want to keep? default it to 3, but allow a user to set up to say 100? If this was in place this would save a bunch of headaches....
Comment by John C.
30369 | February 21, 2019 04:42:44 PM GMT
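The multi-generation backup idea suggested above could be sketched as follows in Python. This is an illustration only: the function name `rotate_backups` and the `.bak.N` naming scheme are assumptions, not anything CF provides. The default of 3 generations follows the suggestion in the comment.

```python
import shutil
from pathlib import Path


def rotate_backups(path, keep=3):
    """Copy `path` to `<name>.bak.1`, shifting older generations up to `keep`."""
    path = Path(path)
    if not path.exists():
        return

    def generation(n):
        # Hypothetical naming scheme: neo-cron.xml.bak.1 is the newest backup.
        return path.with_name(f"{path.name}.bak.{n}")

    # Drop the oldest generation, then shift the rest up by one:
    # .bak.2 -> .bak.3, .bak.1 -> .bak.2, and so on.
    generation(keep).unlink(missing_ok=True)
    for n in range(keep - 1, 0, -1):
        if generation(n).exists():
            generation(n).rename(generation(n + 1))
    shutil.copy2(path, generation(1))
```

Calling `rotate_backups(config_path, keep=3)` just before each rewrite keeps the previous three versions. Rotation like this only helps if the file is rewritten rarely. If it were rewritten on every task execution, all generations would be cycled through quickly.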
Another vote here in support of Charlie's comments. I'd add however, that keeping multiple generations of the files doesn't necessarily make the problem go away. Many of us have lots of scheduled tasks and if the neo-cron.xml file gets regenerated and a new .bak file created at every execution of a scheduled task, the limit of 10 (or even 100 for that matter) won't help much as those would get overwritten pretty quickly. Just store the execution information for each task in the DB (or a file, I don't care) and leave the neo-cron.xml file alone until a scheduled task gets newly created or modified. We've now had this happen twice since Update 7 (once in production and once in a Dev environment). The fact that Update 8 specifically tells you to back up your neo-cron.xml file before updating indicates that this issue has become even more important. I think we would all like to understand the rationale for not doing what would seem like a relatively simple thing that would save potential headaches for customers.
Comment by Mike P.
30370 | February 21, 2019 04:43:01 PM GMT
John and Mike, thanks for your thoughts, but to be clear, it seems you both missed the points I was making in my last comments before yours today--or maybe you were only reading the bug note, not the comments. Anyway, I had just said in comments (written 10 mins before each of yours) that: "Bottom line: info about the DEFINITION of the tasks should be in ONE file, and info about the EXECUTION of tasks should be in another. That would at least ensure people have at least ONE good backup, until they (or CF itself) changes the tasks. Better still would be for CF to keep multiple generations of this file--indeed of all the neo*.xml files."
Comment by Charlie A.
30371 | February 21, 2019 05:05:23 PM GMT
Charlie, totally agree that this "data" should be normalized, and backed up with a multi-generation mechanism. How about submitting a new ticket to bring it to their attention, just in case Adobe staff don't monitor comments on closed tickets. Everyone could then vote on that one to highlight the importance/strength of feeling of this issue to the development team.
Comment by Michael C.
30372 | February 21, 2019 05:45:18 PM GMT
Thanks Charlie - understood but my comment probably wasn't clear. I completely agree with the separation of the EXECUTION from the DEFINITION of scheduled tasks. Tonight, I just updated one of our instances to Update 8 and, as suggested in the release notes, backed up the neo-cron.xml file. After upgrading, while CF was still stopped, I copied the neo-cron.xml back into the cfusion\lib folder, restarted CF, and it gets wiped out. Am I missing something as far as that is concerned?
Comment by Mike P.
30374 | February 22, 2019 03:55:37 AM GMT
+1 - Mixing the schema w/ activity allows the current resource-independent data loss, necessitating the requested fix. Could status please be changed from "Closed/Withdrawn/NotABug" to "Open/To Track/HaveNewInfo"?
Vote by Aaron N.
30381 | February 23, 2019 09:47:50 AM GMT
Please change the status. This requires a fix.
Vote by Alexandre D.
30396 | February 26, 2019 02:32:22 PM GMT
Mike, sorry I missed that last comment of yours on the 22nd. It's been a couple of very busy weeks with the 3 recent updates, one after another each week. So to be clear, I understood that you were saying that making multiple backups alone was not the solution, for the scenario you had stated just before my last comment. I was just clarifying that if indeed Adobe DID separate out the definition and execution, then the multiple backups WOULD be valuable. I see now that you understood that. It just wasn't clear from your previous reply. As for your next issue, it sounds like you simply got bit by the bug that was in CF2016 update 8. That was fixed in update 9 (or would also be in the update 10, released on Mar 1). Can you confirm if either of those has been applied and if you are still having the problem? All that said, I still look for Adobe to address whether they will change this feature request/bug report from "withdrawn" to open. The suggestion to split out the tracking of execution and definition of tasks (as I requested originally here in 2017) is all the more important given the recent debacle of the Feb 12 updates to CF11 and 2016, causing loss of tasks. (No one from Adobe has ever explained why they didn't happen to the same update that day for CF2018.) But again, while the later updates on Feb 25 and Mar 1 DO resolve that issue (of losing tasks), it doesn't negate the original problem and request I am making here.
Comment by Charlie A.
30427 | March 04, 2019 04:07:12 PM GMT
Charlie, agree with everything you are saying. If the definition and execution were split out, multiple backups would be just fine. And yes, I can confirm that Update 9 resolved the scheduled task issue. Haven't even touched Update 10 at this time.
Comment by Mike P.
30428 | March 04, 2019 05:28:02 PM GMT
Thanks, Mike. As for update 10, I realize you may be gun shy after the debacle of update 8. But to be clear, update 10 only adds a security fix (which Adobe deems urgent), so there should be no risk of the kind of problems update 8 did (which changed many things, in addition to adding its own other security fixes for previous vulns). If it may help to know more, I did a blog post on it Friday Mar 1, the day that update was released: https://www.carehart.org/blog/client/index.cfm/2019/3/1/urgent_CF_security_update_Part_1 But this is getting astray of the point of the tracker entry here, so let's leave it at that. I will say (for those following along) that I did ping some Adobe folks privately today to see if we can get someone to revisit this entry. If they don't, then as Michael C suggested on the 21st, I may need to create a new ticket instead to get their attention to this. The comments here are helpful for them to see, though, so let's hope they may respond here instead.
Comment by Charlie A.
30429 | March 04, 2019 05:57:04 PM GMT
Hello everyone, We fixed a bug in CF 2018 update 2 & CF 2016 Update 8. As a part of the fix for this bug, we don't store the lastfire time after each run of task in the neo-cron.xml. Instead we use quartz's api to fetch last run. The bug is CF-4171358 for your reference. Thanks, Suchika
Comment by Suchika S.
30434 | March 05, 2019 09:38:18 AM GMT
Suchika, thanks, and that would be great for lastfire, but the neo-cron.xml still tracks nextfire, which changes on each scheduled execution, for a task set to repeat. That would not be addressed by your plan to obtain lastfire from quartz somehow. More curious, I am finding that lastfire IS still tracked, for a task I created today, with cf2018 update 3 installed. To be clear, I notice these do NOT change if the task is run manually, but only when run as *scheduled*, if that may be how you were testing things. Finally, even if all that were resolved, it still doesn't address the request to track more than one generation of backup. For instance, I find that a backup is made whenever cf is *restarted*. And of course, any change to a task creates a backup. And the point is that each backup overwrite loses the one generation backup that cf currently keeps, increasing the chance that ANY problem with the file (including simply potential corruption on a CF crash) will lead folks to quickly have no good backup to restore to.
Comment by Charlie A.
30438 | March 05, 2019 03:55:47 PM GMT
Hi Charlie, We do not store nextfire in neo-cron.xml, it is also fetched from the quartz api. Whenever the scheduler page in CF administrator is refreshed, it calls a listAPI that writes to neo-cron.xml file with the lastfire & nextfire values. So a task that runs every 5 minutes will not update lastfire & nextfire in the neo-cron.xml after every 5 minutes. But if you refresh the scheduler page in admin the latest lastfire & nextfire values will be written to neo-cron.xml. We are aware that lastfire is not updated when the task is run manually but this is a smaller bug than neo-cron.xml being corrupted. We are trying to fix this. Please raise a bug for one more generation of backup. We will look into this. Thanks, Suchika
Comment by Suchika S.
30439 | March 06, 2019 09:47:01 AM GMT
Sorry for the delay in responding. So Suchika, you are agreeing at least that something CAN cause the lastfire and/or nextfire values to appear in the neo-cron.xml. And that means that information about the execution of the tasks (rather than just definition OF the tasks) is indeed stored in the neo-cron.xml--which is the very thing I'm arguing against here. Indeed, you say that whenever the scheduled task page is refreshed such values would change or are stored. I don't see that (I can refresh over and over and never see the file change). But even if you do, do you see how that alone would be wrong, in terms of my bug report/feature request here? I'm arguing basically that unless someone changes the task (in the Admin or using cfschedule or the admin api), the neo-cron.xml file should not change. Any information about when the task did lastfire (or its state), or when some repeating task is to next fire, should be stored somewhere OTHER than this file. It's just like basic database table normalization, as I said in the last paragraph of my original entry here. One wouldn't (typically) store last purchase info in a table about products. You would have a purchases table, and there would be a relationship defined between the product table and the purchases table, with the productid as a foreign key in the purchases table. I realize that analogy breaks down a bit, because you don't keep track of ALL fires, only the next or last one. But the concept is the same: they're different kinds of data. And again, people expect that the DEFINITION of the tasks should be PROTECTED (such that if it never or rarely changes, the file does not change--and the one backup is not lost), but your own discussion shows that the table COULD change far more often than when the definition does. Now, you may say "well, it only happens when the task list is refreshed", which is a user's choice.
Again, I don't see that happening at all, but if it ever DID happen as you say, then that alone would also be reason to seek that nextfire/lastfire info be stored elsewhere. But instead, I will say that I see the nextfire appearing in there whenever I run a task, with a set time. And if it's set to repeat, that changes each time it's run. Let me clarify: I opened the neo-cron.xml file in an editor that tells me if the file changed (notepad++, in my case, which prompts me to reload it when changed), and I set up a task that would run 3 times in a row, a minute apart. And between runs I watched the file in the editor, and I DID get prompted that the file was changing. More important, I DID see the nextfire value in the file between the runs. And now that I think of it, that was indeed a greenfield task (a new CF2018 install, in which I defined the task in the CF Admin UI, after update 3). Do you really think nextfire (or lastfire) should really be tracked in that file? Even when that risks a) corruption of the file (by increasing the odds it's being written during a crash) or b) loss of the .bak file (since each such change overwrites that bak file)?
Comment by Charlie A.
30548 | March 21, 2019 02:26:54 AM GMT
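For anyone wanting to repeat the observation described above without relying on an editor's reload prompt, a small polling script can report when the file's content changes. This is a rough sketch in Python; the function names are hypothetical, and it simply compares a hash of the file at a fixed interval.

```python
import hashlib
import time
from pathlib import Path


def file_digest(path):
    """Return a SHA-256 hex digest of the file's current contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def watch_for_changes(path, checks, interval=1.0):
    """Poll `path` `checks` times; return the check numbers where content changed."""
    changed_at = []
    previous = file_digest(path)
    for check in range(1, checks + 1):
        time.sleep(interval)
        current = file_digest(path)
        if current != previous:
            # Content differs from the last poll, so something rewrote the file.
            changed_at.append(check)
            previous = current
    return changed_at
```

Pointing `watch_for_changes` at the neo-cron.xml file while a repeating task runs, for example with `checks=180` and `interval=1.0` to cover three minutes, would list the polls at which the file was rewritten. Comparing content rather than timestamps avoids false positives when a file is rewritten with identical data.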
Before I ever saw this issue here, I looked at neo-cron.xml and was surprised to see the file date changing nearly continuously. Viewing the XML, it’s clear the file is tracking each execution of every task. While most of our tasks run a few times a day or perhaps once a week, we have a set of 10 scheduled tasks that each run every 3 minutes. No wonder neo-cron.xml keeps changing! My first thought was that this behavior was odd, making tracking of task configuration changes very difficult – or more precisely, making it very difficult to even know if task configurations had changed when compared to file backups. I completely support separating task DEFINITIONS from EXECUTION states into separate stores. No matter how much pre-checking CF does to expect that a file update should work, there is no way to be certain that it won’t fail and result in a corrupted file. Please re-open this issue and fix this issue. It’s the right thing to do.
Comment by John D.
30554 | March 21, 2019 03:26:25 PM GMT
Adobe folks?
Comment by Charlie A.
30678 | May 01, 2019 02:43:05 PM GMT
(CF 2016,0,13,316217) I realize this issue/thread has been dormant for quite a while now, and if the discussion has moved to the comments section of another issue, please could someone point me to it and I will post there. I noticed a change in behaviour in schedule task lastfire logging, and I'm curious to know what others think of the change. I noticed that: * neo-cron.xml is not getting written to constantly like it used to (a good thing) * when the ColdFusion service is restarted, the value of 'lastfire' for all tasks is shown in the Administrator as 'NOT RUN' which would suggest that the lastfire values are stored in memory for the duration of the CF service uptime, and then just discarded on restart, rather than being saved to a file. It's great that neo-cron.xml is no longer being used as a data store, but shouldn't those times be stored SOMEWHERE??
Comment by Michael C.
33213 | February 28, 2020 11:15:55 PM GMT