This repository has been archived by the owner on Mar 9, 2021. It is now read-only.

Unable to Crawl At All #96

Closed
Gho57X90 opened this issue Jun 22, 2017 · 14 comments

Comments

@Gho57X90

Ever since I found this program, two or three updates ago, I've tried every way I could think of to get it to crawl blogs. It seemingly accepts tumblr URLs just fine, and the download location I've set seems to be ok too, but whenever I hit the 'Crawl' button the program just... sits there.

I've even left the program running for 3-5 hours at a time and it still does nothing. Authentication doesn't seem to affect anything (unless there's more to that than simply logging in to your tumblr account like any user would), and deleting the appdata settings and starting fresh hasn't helped either.

These are my connection settings for the application, if that helps at all.
[Screenshot: tumblthree_2017-06-22_14-35-20]

Between failed efforts and a bunch of Googling, I'm feeling rather frustrated that the one crawler that seems to do what others fail to do doesn't even crawl. I'm feeling kinda desperate at this point, so any help would be greatly appreciated.

@johanneszab
Owner

The exact same settings work for me here. I suppose you've ticked a checkbox for the type of post you want to download in the details window of your blog, as shown on the front page (e.g. Download images)?

Windows version, any proxy or VPN in use?

Otherwise I probably cannot really help you, since downloading works here with the exact same settings and any tips would just be guessing.

@Gho57X90
Author

Gho57X90 commented Jun 22, 2017

I am using Windows 8.1 Pro [build 9600], (my copy is Windows 8 Pro with the 8.1 update from the Windows Store, if that helps any), and I'm not using any VPN or proxy when attempting to download.

I will disclose that I have CyberGhost 6 installed, but I don't actively run it since it's only used as a bypass for country filters on YouTube, and even then that's once every two moons if at all.

This is what I have ticked as far as which posts I try to download is concerned. I assume these are the default settings?
[Screenshot: tumblthree_2017-06-22_15-01-18]

@Gho57X90
Author

Oh, it might help to mention I'm running off an Intel Core 2 Quad Q9650, 8 GB of DDR2 and a Gigabyte EP43-US3L motherboard. I dunno how relevant any of that is, but I know that sometimes running on older hardware can cause problems, so yeah.

@johanneszab
Owner

johanneszab commented Jun 22, 2017

It's not hardware related, and every Windows version above Windows XP (with .NET 4.5) should work fine.

As I've already said, I cannot provide you any more useful help without error/debug messages. You might want to install Visual Studio, download the code and debug it, but I agree that this is kinda overkill. There might be some information in the Event Viewer, but I highly doubt it.

If I were you, I'd try it with as direct a connection to the internet as possible. Do you have any Windows around without all this crap installed? If that works, then there is at least a hint.

But I'm not sure if the connection is the reason at all. Another guess is that you might not be able to access the tumblr api. You could try this version here: #33

Let other people here know what you did if you could figure it out.
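One way to test the "cannot access the tumblr api" guess without a debugger is to request a blog's API page directly and see whether anything answers. The following is a minimal Python sketch, not TumblThree's own code. It assumes the legacy v1 read endpoint (`/api/read` on the blog's subdomain); whether TumblThree calls exactly this URL is an assumption, and the names `api_url` and `probe` are hypothetical.

```python
import urllib.error
import urllib.request


def api_url(blog_name):
    """Build the legacy v1 read URL for a blog (assumed endpoint)."""
    return f"https://{blog_name}.tumblr.com/api/read"


def probe(url, opener=urllib.request.urlopen, timeout=10):
    """Fetch url once and describe the outcome in a short string."""
    try:
        with opener(url, timeout=timeout) as response:
            return f"reachable (HTTP {response.status})"
    except urllib.error.HTTPError as err:
        # The server answered, so the network path itself works.
        return f"server answered with HTTP {err.code}"
    except (urllib.error.URLError, OSError) as err:
        # DNS failure, refused connection, timeout, proxy trouble, ...
        return f"not reachable: {err}"
```

Calling `probe(api_url("staff"))` (where `staff` is only a placeholder blog name) on the affected machine would show whether the request gets through at all. A "not reachable" result would point at the connection, VPN software, or a firewall rather than at the crawler.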

@Gho57X90
Author

Hmm... I think I can make a virtual machine and see if the program works inside that. I can also try installing Visual Studio and debugging the code, but yeah, would prefer to not do that if I don't have to. Considering how many artists' blogs I try to keep up with though, (in addition to other solutions that I'm aware of being in disrepair or simply not doing the fullest of their job ever), I'm kinda really needing this program to work, so I'm willing to do whatever within my abilities to get it working.

I'll comment again when I've had a look into the VM and/or the debugging-in-Visual-Studio method.

@Gho57X90
Author

Ok, I've attempted the Visual Studio method, since my VM isn't working at this time. When attempting to build the solution in order to debug, this is what the Output tab spat out.

1>------ Build started: Project: TumblThree.Presentation, Configuration: Debug Any CPU ------
1>C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\MSBuild\15.0\Bin\Microsoft.Common.CurrentVersion.targets(1964,5): warning MSB3245: Could not resolve this reference. Could not locate the assembly "Microsoft.Expression.Interactions, Version=4.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35, processorArchitecture=MSIL". Check to make sure the assembly exists on disk. If this reference is required by your code, you may get compilation errors.
1>C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\MSBuild\15.0\Bin\Microsoft.Common.CurrentVersion.targets(1964,5): warning MSB3245: Could not resolve this reference. Could not locate the assembly "System.Windows.Interactivity, Version=4.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35, processorArchitecture=MSIL". Check to make sure the assembly exists on disk. If this reference is required by your code, you may get compilation errors.
1>J:\TumblThree-master\TumblThree-master\src\TumblThree\TumblThree.Presentation\Views\FullScreenMediaView.xaml(48,18): error MC3074: The tag 'Interaction.Triggers' does not exist in XML namespace 'http://schemas.microsoft.com/expression/2010/interactivity'. Line 48 Position 18.
1>J:\TumblThree-master\TumblThree-master\src\TumblThree\TumblThree.Presentation\Views\DetailsView.xaml(706,26): error MC3074: The tag 'Interaction.Triggers' does not exist in XML namespace 'http://schemas.microsoft.com/expression/2010/interactivity'. Line 706 Position 26.
========== Build: 0 succeeded, 1 failed, 4 up-to-date, 0 skipped ==========

Just in case it's useful, 'cause I'm not familiar enough with Visual Studio to totally understand what's useful or not, here's what the Error List also came up with.
[Screenshot: devenv_2017-06-23_22-00-08]

To note also, since my last comment I've been trying to run the program on my HP Elitebook 8440p laptop, which uses an Intel Core i5 580M and is also running Windows 8.1 Pro, but installed straight from 8.1 Pro installation media rather than via the 8.1 upgrade. Same results despite completely different hardware.

Is any of this helpful to you? If not, anything else I can do on my end to figure out what's wrong?

@johanneszab
Owner

Looks like those two Microsoft .dlls are part of the Blend for Visual Studio SDK for .NET, found under individual components in the VS 2017 Community installer.

You can either rerun the setup from Control Panel -> Programs and Features -> VS 2017 -> Change, or try to install the .dlls using a NuGet package within Visual Studio itself, as described in this Stack Overflow question.

Using the installer has the advantage that the version number will match your Visual Studio, whereas downloading a random .dll might not work if there is a version mismatch. The missing .dlls are also part of the TumblThree.zip from the release page.

Thanks for testing all this out. I'll update the instructions accordingly to make it easier for the next one to test this out.

That .dll is used for the preview to switch between the movie and image control depending on the input.
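For anyone hitting the same build failure, the MSB3245 warnings in the build output name the missing assemblies directly. As a rough illustration, this Python sketch pulls the short assembly names out of pasted build output. The function name `missing_assemblies` is hypothetical, and the pattern is only matched against the warning wording quoted in this thread.

```python
import re

# MSB3245 warnings quote the full assembly identity; the short name is
# everything before the first comma inside the quotes.
MISSING = re.compile(
    r'warning MSB3245: .*?Could not locate the assembly "([^",]+)'
)


def missing_assemblies(build_output):
    """Return the short names of assemblies MSBuild could not locate."""
    names = []
    for line in build_output.splitlines():
        match = MISSING.search(line)
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names
```

Fed the build output from this thread, it should report `Microsoft.Expression.Interactions` and `System.Windows.Interactivity`, the two .dlls discussed above.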

@Gho57X90
Author

Ah, there we go. Installed the required component to run the application via the VS setup thingy. It appears to have worked since I was able to run the source code just fine after that.

Now, this time I was able to get the Debug Output stuff from Visual Studio.

'TumblThree.exe' (CLR v4.0.30319: DefaultDomain): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: DefaultDomain): Loaded 'J:\TumblThree-master\TumblThree-master\src\TumblThree\TumblThree.Presentation\bin\Debug\TumblThree.exe'. Symbols loaded.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\PresentationFramework\v4.0_4.0.0.0__31bf3856ad364e35\PresentationFramework.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\WindowsBase\v4.0_4.0.0.0__31bf3856ad364e35\WindowsBase.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\System.Core\v4.0_4.0.0.0__b77a5c561934e089\System.Core.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\System\v4.0_4.0.0.0__b77a5c561934e089\System.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_64\PresentationCore\v4.0_4.0.0.0__31bf3856ad364e35\PresentationCore.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\System.Xaml\v4.0_4.0.0.0__b77a5c561934e089\System.Xaml.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
Step into: Stepping over non-user code 'TumblThree.Presentation.App..ctor'
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\System.Configuration\v4.0_4.0.0.0__b03f5f7f11d50a3a\System.Configuration.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\System.Xml\v4.0_4.0.0.0__b77a5c561934e089\System.Xml.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
Step into: Stepping over non-user code 'TumblThree.Presentation.App.InitializeComponent'
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\PresentationFramework.Aero2\v4.0_4.0.0.0__31bf3856ad364e35\PresentationFramework.Aero2.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\System.ComponentModel.Composition\v4.0_4.0.0.0__b77a5c561934e089\System.ComponentModel.Composition.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'J:\TumblThree-master\TumblThree-master\src\TumblThree\TumblThree.Presentation\bin\Debug\WpfApplicationFramework.dll'. Symbols loaded.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'J:\TumblThree-master\TumblThree-master\src\TumblThree\TumblThree.Presentation\bin\Debug\TumblThree.Applications.dll'. Symbols loaded.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\System.Runtime.Serialization\v4.0_4.0.0.0__b77a5c561934e089\System.Runtime.Serialization.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'J:\TumblThree-master\TumblThree-master\src\TumblThree\TumblThree.Presentation\bin\Debug\TumblThree.Domain.dll'. Symbols loaded.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\System.Xml.Linq\v4.0_4.0.0.0__b77a5c561934e089\System.Xml.Linq.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'J:\TumblThree-master\TumblThree-master\src\TumblThree\TumblThree.Presentation\bin\Debug\Guava.RateLimiter.dll'. Symbols loaded.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\SMDiagnostics\v4.0_4.0.0.0__b77a5c561934e089\SMDiagnostics.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\System.ServiceModel.Internals\v4.0_4.0.0.0__31bf3856ad364e35\System.ServiceModel.Internals.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'MetadataViewProxies_7572ad7b-8a66-4dfd-a895-57d6790ccca1'.
13:21:42.259 > ManagerController.LoadLibrary:Start
13:21:42.267 > ManagerController:GetFilesCore Start
13:21:42.473 > ManagerController.GetFilesCore End
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'J:\TumblThree-master\TumblThree-master\src\TumblThree\TumblThree.Presentation\bin\Debug\System.Windows.Interactivity.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'J:\TumblThree-master\TumblThree-master\src\TumblThree\TumblThree.Presentation\bin\Debug\Microsoft.Expression.Interactions.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\PresentationFramework-SystemXmlLinq\v4.0_4.0.0.0__b77a5c561934e089\PresentationFramework-SystemXmlLinq.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\PresentationFramework-SystemXml\v4.0_4.0.0.0__b77a5c561934e089\PresentationFramework-SystemXml.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\Users\derp\AppData\Local\Temp\VisualStudio.XamlDiagnostics.12460\Microsoft.VisualStudio.DesignTools.WpfTap.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\UIAutomationTypes\v4.0_4.0.0.0__31bf3856ad364e35\UIAutomationTypes.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'TumblThree.exe' (CLR v4.0.30319: TumblThree.exe): Loaded 'C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\UIAutomationProvider\v4.0_4.0.0.0__31bf3856ad364e35\UIAutomationProvider.dll'. Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
13:21:44.451 > ManagerController.LoadLibrary:End
The thread 0x3188 has exited with code 0 (0x0).
The thread 0x17fc has exited with code 0 (0x0).

Exception thrown: 'System.Threading.Tasks.TaskCanceledException' in mscorlib.dll
Exception thrown: 'System.Threading.Tasks.TaskCanceledException' in mscorlib.dll
Exception thrown: 'System.Threading.Tasks.TaskCanceledException' in mscorlib.dll

The thread 0x4c has exited with code 0 (0x0).
The thread 0x2a78 has exited with code 0 (0x0).
The program '[12460] TumblThree.exe' has exited with code 0 (0x0).

I noticed that the two "thread ... has exited" lines (0x3188 and 0x17fc) popped up as the program was attempting to crawl the blogs (which, interestingly enough, used the settings in appdata made by the compiled version of the application; dunno if that's normal or not).
As for the three TaskCanceledException lines, those were thrown when I hit the 'Stop Crawling' button. Both of these things occurred in both of the attempts I've made.

As for anything else, and I think this is a Visual Studio issue...
[Screenshot: devenv_2017-06-24_13-44-19]

Not really sure what's happening here since my attempts at Googling to find out have failed.
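Most of the debug output above is Visual Studio reporting which assemblies were loaded, which buries the few lines that come from the application itself. A small Python sketch for filtering a pasted log down to the rest might look like this. The function name `interesting_lines` is hypothetical, and the pattern only covers the two kinds of noise visible in this log.

```python
import re

# Lines that start with a quoted module name followed by "(CLR" are
# assembly-load notices; "Step into:" lines are debugger chatter.
NOISE = re.compile(r"^('[^']+' \(CLR |Step into:)")


def interesting_lines(debug_output):
    """Return the non-empty lines that are not load or step notices."""
    return [
        line
        for line in debug_output.splitlines()
        if line.strip() and not NOISE.match(line)
    ]
```

Applied to the output above, this would leave the timestamped `ManagerController` trace lines, the thread-exit messages, the `Exception thrown` lines, and the final program-exit line.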

@johanneszab
Owner

johanneszab commented Jun 24, 2017

The important messages are these:

13:21:44.451 > ManagerController.LoadLibrary:End
The thread 0x3188 has exited with code 0 (0x0).
The thread 0x17fc has exited with code 0 (0x0).
Exception thrown: 'System.Threading.Tasks.TaskCanceledException' in mscorlib.dll

but there is nothing wrong with them. LoadLibrary:End is a debug message I added to check that the LoadLibrary method exits. Since TumblThree is now properly async code, there are a lot of threads which start and do things concurrently. One of them may have loaded the library. An exit code of 0 usually means success.
The TaskCanceledException is also intended: it is how the crawl gets shut down.
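To show why a cancellation exception on "Stop" is normal rather than a fault, here is a rough analogy in Python's asyncio. This is not TumblThree's C# code; `crawl` is a hypothetical stand-in for a long-running task, and asyncio's `CancelledError` plays the role that `TaskCanceledException` plays in .NET.

```python
import asyncio


async def crawl():
    """Stand-in for a long-running crawl (hypothetical)."""
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        # Cleanup would go here; re-raise so the caller sees the cancellation.
        raise


async def main():
    task = asyncio.create_task(crawl())
    await asyncio.sleep(0)  # give the crawl a chance to start
    task.cancel()           # roughly what a "Stop" button would do
    try:
        await task
    except asyncio.CancelledError:
        return "crawl stopped by cancellation"
    return "crawl finished on its own"


result = asyncio.run(main())
print(result)  # prints "crawl stopped by cancellation"
```

The exception here is the mechanism that unwinds the running task, so seeing it in a debugger's output after pressing stop says nothing about why the crawl did no work beforehand.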

So, you might have to set breakpoints (next to the code line numbers) at various points. You can start with the Crawl Task in the CrawlerController of the Applications assembly and work your way inwards from there. That's the command used to start the crawl (i.e. what runs when you hit the Crawl button in the user interface).

Edit: I've never seen the messages in the screenshot. It also helps to always check the exception window, not just the Debug window. Clicking an exception in the exception window usually brings you directly to the right spot.

@Gho57X90
Author

Frustratingly, I wasn't able to find any exceptions in any tests I've run besides the TaskCanceledException thingy you indicated as a normal exception. Additionally, I'm doing all this with little understanding of the actual code, as I don't have any experience in C# or any of the other things that TumblThree uses to function, so it's entirely possible that I'm missing something that would be clearer to someone else.

I'm attempting again to try and get a virtual machine working with Windows 8.1 and Windows 7 respectively to see if it's a problem with the operating system my computers are using. For now, I'm stuck with the only theory being that Windows 8 and TumblThree simply don't like each other.

That said, the other crawler I was using, referred to simply as Tumblr Image Downloader, had stopped working for me too, (which is what prompted me to find this program to begin with,) and I think it works a similar way that TumblThree currently does. Maybe there was an update or a missing dependency on my computers running Windows 8 that knocked out functionality that the two require to function?

@johanneszab
Owner

> Maybe there was an update or a missing dependency on my computers running Windows 8 that knocked out functionality that the two require to function?

I don't think so. Since Tumblr Image Downloader is written in Java (and not C# like TumblThree) and all other internet-related things seem to work, I cannot think of any way that would be possible.

Is the blog in the manager (left side) actually shown as Online in the Status row of TumblThree? If so, then TumblThree could already connect to the tumblr api. The scanning is basically doing the exact same thing, just with different urls.

If it's shown as offline, did you actually try the v.1.0.5.16 release?

@johanneszab
Owner

I'll close this but if you have any more questions, I'll still try to answer them.

@Spednsteve

I would have loved to give a positive review. However, this app would not function at all.
I reloaded the app twice: the first time I used the "more stable" version (.63), and the second time your latest version (.68), with absolutely no success. The only things I could do were as follows:
successfully authenticate login
adjust some settings, although I also tried leaving them at the defaults.
paste a Tumblr blog URL into the URL field.
Clicking the "add" button did not function; it did not add the blog URL at all. Most of the buttonology was not working... grayed out.
Wasted a lot of time tonight on this app. Obviously many have used this app successfully. I also see a lot of comments from users who could not get it to work. Maybe I'm missing something, or maybe a lot of users are taxing the app today to download their Tumblr blogs. You really should pull this app until it is retested, stabilized and the instructions are simplified.

@Gho57X90
Author

Gho57X90 commented Dec 17, 2018

> I would have loved to give a positive review. However, this app would not function at all.
> I reloaded the app twice: the first time I used the "more stable" version (.63), and the second time your latest version (.68), with absolutely no success. The only things I could do were as follows:
> successfully authenticate login
> adjust some settings, although I also tried leaving them at the defaults.
> paste a Tumblr blog URL into the URL field.
> Clicking the "add" button did not function; it did not add the blog URL at all. Most of the buttonology was not working... grayed out.
> Wasted a lot of time tonight on this app. Obviously many have used this app successfully. I also see a lot of comments from users who could not get it to work. Maybe I'm missing something, or maybe a lot of users are taxing the app today to download their Tumblr blogs. You really should pull this app until it is retested, stabilized and the instructions are simplified.

Just to contribute my two cents; I'm not having much trouble with the application myself, currently. I've been using it non-stop since tumblr's announcement of their policy change on NSFW/Adult content on their site, with only one or two specific blogs making the app crash when attempting to crawl. While it doesn't automatically add links anymore for whatever reason, (maybe I've not read something in the changelogs?), all the buttons in the main window work perfectly fine for me as of version 1.0.8.65 with .NET framework 4.0.30319.42000 64 Bit.

All things considered, I think I've basically inadvertently stress-tested it with little to no problems with a roster of somewhere over a thousand blogs. As such, I for one have to give it a positive response.


3 participants