
Import sometimes hangs, or does not import all entries #18

Closed
amityweb opened this issue Sep 17, 2015 · 12 comments

@amityweb

I have 546 entries to import. Every time I import, it hangs. Usually it imports fewer than 200.

The task then gets stuck in the process notice in the top right. If I import again, the new imports are stuck in Pending, as the first process never completes. I have to restore the database from a backup to start again.

FeedMe logs are empty, with no errors shown.

If I use the direct link to import, I get 500 Internal Server errors.

Initially I thought it might be due to low server memory, so I increased it from 1 GB to 4 GB. I still get the issues.

The error log reports this:

[Thu Sep 17 11:19:16.852241 2015] [fcgid:warn] [pid 6617] [client xx.xx.xx.xx:55379] mod_fcgid: read data timeout in 40 seconds, referer: http://mydomain/admin/feedme/feeds
[Thu Sep 17 11:19:16.852320 2015] [core:error] [pid 6617] [client xx.xx.xx.xx:55379] End of script output before headers: index.php, referer: http://mydomain.co.uk/admin/feedme/feeds

I increased the timeout to 90 seconds, but get the same error (just at 90 seconds instead).

I increased the timeout to unlimited: the first time it finished but did not import all entries, with no errors, and the second time it hung again, still with no errors.

Still nothing in the Logs tab though. I'd expect to see a success status for each imported entry and an error for any that did not import, so we can fix them.

So maybe it's a good idea to break the import into batches to keep each process's run time below common timeout limits? And to offer a Stop process icon in the status dropdown in case of stuck processes.

I am going to have to try to split my file into 5 different ones and create 5 different imports.
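For illustration, here's a minimal sketch of how a feed file could be split into batches of 100, assuming a flat XML feed whose records are repeated `<entry>` nodes; the file and element names are placeholders only:

```python
# split_feed.py - illustrative sketch only; assumes a flat XML feed whose
# records are repeated <entry> elements directly under the root.
import xml.etree.ElementTree as ET

BATCH_SIZE = 100

tree = ET.parse("feed.xml")              # hypothetical source file
root = tree.getroot()
entries = list(root.findall("entry"))    # hypothetical record element name

for i in range(0, len(entries), BATCH_SIZE):
    batch_root = ET.Element(root.tag)
    for entry in entries[i:i + BATCH_SIZE]:
        batch_root.append(entry)
    ET.ElementTree(batch_root).write(
        f"feed_part_{i // BATCH_SIZE + 1}.xml",
        encoding="utf-8",
        xml_declaration=True,
    )
```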

Thanks

@amityweb
Author

I have split the data into files of 100 entries each. I imported the first file, and of those 100, only 90 were imported. No errors, no messages to show me why or which ones were not imported.

@engram-design
Contributor

I'm not sure this is a memory issue, and I haven't seen those specific errors turn up before, especially with reference to mod_fcgid. To give you some background, the plugin does exactly this: it splits processing up into batches of 100 feed items. Each batch of 100 is processed as a sub-task of the parent task. Craft's Tasks service actually does an amazing job at keeping memory consumption down, so you really should never have issues there.
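As a rough illustration of that batching idea (not the plugin's actual PHP code, just a sketch with made-up names), the parent job works through the feed in sub-steps of 100 items so only one batch is handled at a time:

```python
# Illustrative sketch only: a parent job that processes the feed in
# sub-tasks of 100 items, so each unit of work stays small.
BATCH_SIZE = 100

def batches(items, size=BATCH_SIZE):
    for start in range(0, len(items), size):
        yield items[start:start + size]

def save_entry(item):
    pass  # stand-in for saving one feed item as an entry

def run_feed_task(feed_items):
    total_steps = -(-len(feed_items) // BATCH_SIZE)  # ceiling division
    for step, batch in enumerate(batches(feed_items), start=1):
        for item in batch:
            save_entry(item)
        print(f"sub-task {step}/{total_steps} complete")

run_feed_task([f"item {n}" for n in range(546)])
```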

It would be great to have a stop-process-style function, but because we're using Craft's Tasks service, the plugin is at the mercy of what that service provides. The only way to cancel a Task is to remove its row from the tasks database table.
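A minimal sketch of that manual cancellation, assuming a typical Craft install with the default `craft_` table prefix; the connection details and task id are placeholders, and the database should be backed up first:

```python
# Sketch only: inspect the tasks table and remove a stuck task row.
# Assumes the default craft_ table prefix; back up the database first.
import pymysql

conn = pymysql.connect(host="localhost", user="craft",
                       password="secret", database="craft")
try:
    with conn.cursor() as cur:
        cur.execute("SELECT id, type FROM craft_tasks")
        for row in cur.fetchall():
            print(row)  # identify the stuck Feed Me task
        cur.execute("DELETE FROM craft_tasks WHERE id = %s", (123,))  # placeholder id
    conn.commit()
finally:
    conn.close()
```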

Would you be able to submit an issue through the Help tab on the Feed Me page? This will provide us with some further information on your feed, which might help get to the bottom of things...

Edit: Ah - just noticed you did send a help request 😄

@amityweb
Author

So there are three issues...

The first is a timeout issue. Increasing the timeout beyond the time the import actually takes works, so an unlimited timeout works.

The second issue was that we were importing data into a web address field which requires http:// on the front, and our data did not have http://, so those imports were failing. By a process of elimination over several hours we discovered the bad data. Feed Me should really have logging that shows a success/error message for each import and highlights why an entry was not imported (if that's possible).

The third issue is unresolved. I have no idea what the problem is, but for some reason it still hangs on import, and we do not get any errors to show why.

I am still missing one entry from each file except the first and last file though, so I am missing 4 entries.

We've almost got it working now, though, by increasing the timeout, adding http:// to the front of all web address field values, and importing the entries in multiple files of 100 records each.
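A rough sketch of that http:// fix as a pre-processing pass over the feed data, assuming a CSV with a groupWebsite column (the field handle that appears in the log message below); the file names are placeholders:

```python
# Sketch: prepend http:// to any website value that lacks a scheme,
# so the link field's "Website must be a valid link" validation passes.
import csv

with open("groups.csv", newline="") as src, \
     open("groups_fixed.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        url = (row.get("groupWebsite") or "").strip()
        if url and not url.startswith(("http://", "https://")):
            row["groupWebsite"] = "http://" + url
        writer.writerow(row)
```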

Update: it is now 12:43 pm and the logs now show 3 entries, one of which shows Import Groups 1: {"groupWebsite":["Website must be a valid link."]} from 9/17/2015 11:13 AM. This was not showing earlier. In fact I just ran a successful import and that is not in the logs, so I wonder if there is some sort of delay before entries show in the logs, or whether some logs are just not being added. Strange.

@engram-design
Contributor

You may want to switch off the backup option to see if that speeds things up. Depending on how big your database is, the backup can cause a severe lag before your feed processing starts.

Those sorts of error messages should definitely be shown, as they're validation issues when saving the entry. I can see you're using SproutLinkField_Link. It's very possible that your other issues are causing these errors not to appear; they should be shown in the Logs tab.

Log items are flushed to disk at the end of the task. This is most likely causing issues when you run into a timeout, so I'll look at flushing them at the end of every batch.
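Conceptually the change is just moving the flush inside the batch loop, so a timeout mid-run doesn't lose everything; a minimal sketch, with all names made up:

```python
# Sketch: buffer log lines in memory, but write them to disk after every
# batch instead of only once at the end, so a timeout mid-run keeps the
# log lines from batches that already finished.
log_buffer = []

def log(message):
    log_buffer.append(message)

def flush_log(path="feedme.log"):
    if not log_buffer:
        return
    with open(path, "a") as fh:
        fh.write("\n".join(log_buffer) + "\n")
    log_buffer.clear()

def process_batches(batches):
    for number, batch in enumerate(batches, start=1):
        for item in batch:
            log(f"imported {item}")
        flush_log()  # per-batch flush, not end-of-task

process_batches([["a", "b"], ["c", "d"]])
```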

@engram-design
Contributor

A few extra notes. I've noticed you're using the Delete duplication option - this will delete all entries for the chosen section. This might take some time depending on the number of entries in that section.

There was also a minor bug when using the Delete duplication option - successfully imported entries would never show up in your log. This has been fixed in the latest release.

@amityweb
Author

I am using Add Entries now; I tried all 3 options. I also disabled the backup (sorry, I should have said). So the issues are still the same.

The 4 remaining missing entries are email links that were malformed, so they were not imported, same as the web address issue. But there was no log of it, so it's a tedious task to find them.
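Since nothing was logged for these, a pre-flight pass that flags malformed email values can save the manual hunt; a rough sketch, with the file and column names made up:

```python
# Sketch: report rows whose email value looks malformed, so bad records
# can be fixed before running the import.
import csv
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately simple check

with open("groups.csv", newline="") as src:
    for line_no, row in enumerate(csv.DictReader(src), start=2):
        email = (row.get("groupEmail") or "").strip()  # hypothetical column name
        if email and not EMAIL_RE.match(email):
            print(f"row {line_no}: suspicious email value {email!r}")
```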

All imported now though! It took about 5 hours due to troubleshooting the above. I still import in batches; it still hangs when all 546 are imported in one go, with no timeout errors. It also sometimes hangs when using batches if I click Run Task repeatedly to add to the queue. If I wait for one to finish before starting another, it seems to work OK.

@amityweb
Author

The log of entries is showing now. Would it log in production AND dev? We've been switching between the two; we're in dev mode now. I notice the status circle sometimes doesn't appear and imports don't run in the background - that may have been in production mode. Could it be related to the logs not recording?

Anyway, all done now. Hopefully the above issues will help improve it. It's my first time using this plugin, and it's been a bit of a nightmare to be honest!

Thanks

@engram-design
Contributor

Yep, it'll log whether devMode is on or off, but overall it'll run a lot faster with devMode off. As I mentioned, though, the previous release didn't log successful imports when Delete was selected.

Sometimes the status indicator doesn't appear right away until you refresh the page. Pretty sure this is a Craft issue, but the task is definitely running as soon as you hit the Run Task button.

Sorry you've had issues using the plugin; I appreciate the feedback. We'll take it on board and hopefully get to the bottom of what was causing the issues!

Thanks for your patience and information through your issues.

@mdxprograms

What is the best way of stopping the task if it hangs?

@engram-design
Contributor

Follow the guide here - http://buildwithcraft.com/help/stuck-tasks - or have a look at Task Manager from @boboldehampsink.

@mdxprograms

thanks @engram-design

@engram-design
Contributor

Closing for now. Please let me know if you need this reopened.
