Commit
Remove debug() output commands from a lot of places. They are actually more confusing than helpful.

Indexer class
- Changed the default block interval from 512 to 2048.
- Added the postponeWrite field. This comes into play when an empty block occurs. Empty blocks are no longer used for index entries, as they caused a lot of problems leading to faulty results.
- Added a lot of debug code which can help a developer to better understand how the index is created. Fields and methods are:
  - writeOutOfPartialDecompressedBlocks
  - storageForDecompressedBlocks
  - writeOutOfPartialDecompressedBlocks
  - storageForPartialDecompressedBlocks
  - partialBlockinfoStream
  - enableWriteOutOfDecompressedBlocksAndStatistics()
  - enableWriteOutOfPartialDecompressedBlocks()
- Extracted the methods createIndexEntryFromBlockData(), storeDictionaryForEntry() and writeIndexEntryIfPossible() to make the code more readable.
- Reworked finalizeProcessingForCurrentBlock() to make it more readable and to make it use, and therefore output, the above-mentioned debug information.

Extractor class
- Added the roundtripBuffer field, which will be used later to enable:
  - error recognition, if the extracted data is not made up of extractionMultiplier-sized records
  - complete output with at least empty entries for a record.
  However, this is not yet used.
- Removed the skipLines field, as the whole mechanism was reworked and both index and extract should now produce proper results.
- Reworked processDecompressedChunkOfData(); it is now much easier to read and a lot less complicated.

Reworked the Starter to accept more command line options and pass them to the specific runners.

Changed ZLibBasedFASTQProcessorBaseClass.splitStr so that it no longer adds an empty line to the back of the list if the last char is a newline character.
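
The splitStr behavior described above can be illustrated with a minimal sketch. The real signature of ZLibBasedFASTQProcessorBaseClass::splitStr is not shown in this commit message, so the free function `splitLines` and its parameters are hypothetical stand-ins, not the project's code. The sketch only demonstrates the stated rule: a trailing newline ends the last line and does not start a new, empty one.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for splitStr (name and signature are assumptions).
// Splits text at each delimiter. A delimiter at the very end of the text
// terminates the last line and does NOT produce an extra empty entry.
std::vector<std::string> splitLines(const std::string &text, char delimiter = '\n') {
    std::vector<std::string> lines;
    std::string::size_type start = 0;
    while (start < text.size()) {
        const std::string::size_type end = text.find(delimiter, start);
        if (end == std::string::npos) {
            // Remaining text has no delimiter: it is the final, unterminated line.
            lines.push_back(text.substr(start));
            break;
        }
        lines.push_back(text.substr(start, end - start));
        start = end + 1;  // continue after the delimiter
    }
    return lines;
}
```

With this sketch, `splitLines("a\nb\n")` and `splitLines("a\nb")` both yield two entries, while empty lines in the middle of the text (as in `"a\n\nb"`) are preserved.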

TestResourcesAndFunctions
- Add readLinesOfFile(), readFile(), readLinesOfResourceFile(), readResourceFile(), compareVectorContent() and getTestVectorWithSimulatedBlockData() to reduce code duplication and improve test readability. The code is taken from IndexerTest and ExtractorTest.
- The method getTestVectorWithSimulatedBlockData() returns a vector with files for simulating decompressed gzip block data. This tries to mimic common pitfalls like small blocks, empty blocks, blocks without newlines and so on.

IndexerTest
- Add a test for correct block line counting, which uses the block data from getTestVectorWithSimulatedBlockData().
- Move code to the aforementioned methods readLinesOfFile() and so on.

ExtractorTest
- Analogous to IndexerTest, add a test for correct extraction of lines from block data. This also uses getTestVectorWithSimulatedBlockData().
- Like in IndexerTest, use the extracted methods for file reading and vector comparison.

Add the blockAndLineCalculations test resource directory. This contains several files with "decompressed" block data and a file which describes the layout of the joined block files.
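
To make the pitfalls named above concrete (small blocks, empty blocks, blocks without newlines), here is a rough sketch of line counting over a sequence of decompressed blocks. It is not the Indexer's actual algorithm: the struct `BlockLineInfo`, its fields and the function `summarizeBlocks` are hypothetical names chosen for illustration. The only behavior taken from the commit message is that empty blocks produce no entry.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-block summary (all names are illustrative assumptions).
struct BlockLineInfo {
    std::size_t blockIndex;    // position of the block in the input vector
    std::size_t startingLine;  // zero-based line that contains the block's first byte
    std::size_t newlineCount;  // number of '\n' characters inside the block
    bool startsMidLine;        // true if the preceding data did not end with '\n'
};

// Walks over decompressed blocks and records where each non-empty block
// starts in terms of lines. Empty blocks are skipped and yield no entry.
std::vector<BlockLineInfo> summarizeBlocks(const std::vector<std::string> &blocks) {
    std::vector<BlockLineInfo> result;
    std::size_t linesSoFar = 0;  // complete lines seen before the current block
    bool midLine = false;        // whether a line is still open from earlier data
    for (std::size_t i = 0; i < blocks.size(); ++i) {
        const std::string &block = blocks[i];
        if (block.empty()) {
            continue;  // no data, so neither the line count nor the open-line state changes
        }
        const auto newlines =
            static_cast<std::size_t>(std::count(block.begin(), block.end(), '\n'));
        result.push_back({i, linesSoFar, newlines, midLine});
        linesSoFar += newlines;
        midLine = block.back() != '\n';
    }
    return result;
}
```

For the input `{"@r1\nAC", "", "GT", "\n+\n!!!!\n"}` the sketch produces three entries. The block `"GT"` contains no newline, so it and the block after it both start on line 1, inside a line that is still open.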
heinold committed Jun 7, 2019
1 parent 3a2fa78, commit 16851f4
Showing 29 changed files with 625 additions and 214 deletions.
@@ -1,13 +1,16 @@
.idea/workspace.xml
*.a
test/Testing
CMakeFiles
*.cmake
*.cbp
Makefile
*.fastq
*.fq
CMakeCache.txt
CMakeFiles
Makefile
Testing
cmake-build-debug/
release
debug
cmake-build-*
src/fastqindex
test/testapp
test/Testing