dbencic/reversing-quicktest

# Quick Python test

Please use Python 3+

1st task, word counter

Implementation can be found in wordcounter.py

Usage

For example, run python wordcounter.py -s wordcounter_test.txt, where the source (-s) is the file from which words are read.

python wordcounter.py --h lists the startup options. In addition to -s, the -o parameter specifies the file to which the result is written, for example python wordcounter.py -s wordcounter_test.txt -o out.txt.

Final notes

Reading of Unicode files and words is not properly tested and should probably be improved.

In terms of separation of concerns, wordcounter.py could be improved by separating word reading from word processing. This would make the code more testable and allow different data sources to be plugged in, but it would mean looping through the list once more. Since the input in this specific case is expected to be a file, I've chosen to optimise for speed.
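As an illustration of the separation described above, here is a minimal sketch, not the code in wordcounter.py. It assumes words are runs of word characters compared case-insensitively; the names iter_words and count_words are hypothetical. Using a generator is one possible way to keep reading and counting apart without materialising an intermediate list.

```python
import re
from collections import Counter
from typing import Iterable, Iterator

# Assumption: a "word" is a run of word characters (letters, digits, underscore).
WORD_RE = re.compile(r"\w+")


def iter_words(lines: Iterable[str]) -> Iterator[str]:
    """Yield lowercased words lazily from any iterable of text lines."""
    for line in lines:
        for match in WORD_RE.finditer(line):
            yield match.group().lower()


def count_words(words: Iterable[str]) -> Counter:
    """Count occurrences of each word; independent of where the words come from."""
    return Counter(words)


if __name__ == "__main__":
    # An open file object is itself an iterable of lines, so
    # count_words(iter_words(open(path, encoding="utf-8"))) would work as well.
    sample = ["The quick brown fox", "jumps over the lazy dog"]
    for word, count in count_words(iter_words(sample)).most_common(3):
        print(word, count)
```

Because iter_words accepts any iterable of strings, a test can pass a plain list instead of a file.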

2nd task, paragraph cleaner

Implementation can be found in pcleaner.py

It is implemented as specified: it cleans only paragraphs containing only white spaces.

Paragraphs with attributes (e.g. <p style="color: red">) that contain only white spaces are also cleared.

See the comments in the clean() method for different cleaning policies (for example, if you also want to clean paragraphs containing tabs and newlines).
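The two cleaning policies mentioned above could be sketched with regular expressions as follows. This is an assumption-laden illustration, not the actual clean() from pcleaner.py: the include_tabs_and_newlines flag is hypothetical, empty paragraphs are assumed to be removed too, and attribute values containing ">" are not handled.

```python
import re

# <p>, optionally with attributes, whose body is only space characters.
# Assumption: a completely empty <p></p> is removed as well.
EMPTY_P_SPACES = re.compile(r"<p(\s[^>]*)?>[ ]*</p>", re.IGNORECASE)

# Broader policy: the body may be any whitespace (spaces, tabs, newlines).
EMPTY_P_ANY_WS = re.compile(r"<p(\s[^>]*)?>\s*</p>", re.IGNORECASE)


def clean(html: str, include_tabs_and_newlines: bool = False) -> str:
    """Remove paragraphs that contain only blanks, according to the chosen policy."""
    pattern = EMPTY_P_ANY_WS if include_tabs_and_newlines else EMPTY_P_SPACES
    return pattern.sub("", html)


if __name__ == "__main__":
    page = '<p>keep me</p><p style="color: red">   </p><p>\t\n</p>'
    print(clean(page))
    print(clean(page, include_tabs_and_newlines=True))
```

The pattern requires whitespace or ">" directly after "<p", so tags such as <pre> are left alone.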

Usage

For example, run python pcleaner.py -s pcleaner_test.html, where the source (-s) can be a file or an http / https URL.

python pcleaner.py --h lists the startup options. In addition to -s, the -o parameter specifies the file to which the result is written, for example python pcleaner.py -s http://www.google.com -o out.html.

Testing

The file test_pcleaner.py contains some common test cases.

Final notes

Handling of the encoding of source (file, URL) data can be improved further. It can raise codec errors if the encoding is not specified in the response header (when no encoding is given in the header, the input is expected to be UTF-8).
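One possible direction for the improvement mentioned above is sketched here. It is not what pcleaner.py does: the helper decode_body is hypothetical, and falling back to a lossy UTF-8 decode instead of raising is an assumption made for the example. The charset is read from a Content-Type header value using the standard library's email.message.Message.

```python
from email.message import Message
from typing import Optional


def decode_body(raw: bytes, content_type: Optional[str], default: str = "utf-8") -> str:
    """Decode raw bytes using the charset from a Content-Type header value, if any."""
    header = Message()
    if content_type:
        header["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is present.
    charset = header.get_content_charset() or default
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Assumption: unknown or wrong charsets degrade to a lossy decode
        # rather than aborting with a codec error.
        return raw.decode(default, errors="replace")


if __name__ == "__main__":
    body = "é".encode("latin-1")
    print(decode_body(body, "text/html; charset=ISO-8859-1"))
```

For a URL source, the header value could come from the response headers; for a local file there is no header, so the default applies.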

3rd task, caching function return value

Implementation can be found in cache_decorator.py

Cache expiry time and number of hits are hardcoded, as requested by the task, but the decorator can be changed to accept those two parameters as arguments.

Time-triggered eviction is done by one timer thread per cached item, which simplifies the code but may not be appropriate for a large number of cached items. Still, evicting items when the timer expires releases memory when some cached items are rarely hit. For a real-world scenario, the time-based eviction policy could be implemented differently.
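The approach described above, with the two limits passed as arguments, might look roughly like the sketch below. It is not the code in cache_decorator.py: the name cached, the parameters max_hits and ttl_seconds, and their default values are placeholders, and only hashable positional arguments are supported.

```python
import functools
import threading


def cached(max_hits: int = 10, ttl_seconds: float = 300.0):
    """Cache results; evict an entry after max_hits cache hits or ttl_seconds."""

    def decorator(func):
        cache = {}  # args tuple -> {"value", "hits", "timer", "token"}
        lock = threading.Lock()

        def evict(key, token):
            # Runs on the timer thread; only removes the entry this timer belongs to.
            with lock:
                entry = cache.get(key)
                if entry is not None and entry["token"] is token:
                    del cache[key]

        @functools.wraps(func)
        def wrapper(*args):
            with lock:
                entry = cache.get(args)
                if entry is not None:
                    entry["hits"] += 1
                    if entry["hits"] >= max_hits:
                        # Hit limit reached: serve this value, then drop the entry.
                        entry["timer"].cancel()
                        del cache[args]
                    return entry["value"]
            value = func(*args)  # computed outside the lock
            token = object()
            timer = threading.Timer(ttl_seconds, evict, args=(args, token))
            timer.daemon = True  # do not keep the process alive for pending timers
            with lock:
                previous = cache.get(args)
                if previous is not None:
                    previous["timer"].cancel()
                cache[args] = {"value": value, "hits": 0, "timer": timer, "token": token}
            timer.start()
            return value

        return wrapper

    return decorator


if __name__ == "__main__":
    @cached(max_hits=2, ttl_seconds=1.0)
    def square(x):
        print("computing", x)
        return x * x

    for _ in range(4):
        square(3)
```

One timer per entry keeps the logic short, as noted above; a single background sweeper or lazy expiry checks on access would be common alternatives when the number of cached items is large.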

Invoking/Testing

To test eviction based on the number of times the function was invoked, run the test test_cache_decorator.py.

For time-based eviction scenarios, run the test test_cache_decorator_time.py. Time-based eviction is separated into its own test case because it takes 10 minutes to run.

Final notes

If you want to see more detailed log messages, change the log configuration in cache_decorator.py to logging.basicConfig(level=logging.DEBUG).

About

ReversingLabs quick python test
