
Stop dumping GAM to stdout when reading >1k BED or GFF records #4363

Merged
merged 6 commits into master on Aug 7, 2024

Conversation

adamnovak (Member)

Changelog Entry

To be copied to the draft changelog by merger:

  • When reading more than 1000 BED or GFF records, vg will no longer dump the first records to standard output and forget about them

Description

In ae44fca @jmonlong improved memory usage when converting large BED and GFF files to GAM by adding points at which the output buffer is flushed. But when I merged #4248, I didn't notice that that buffer isn't always an output buffer: sometimes it's an internal collection of annotations to work with later.

So whenever we tried to load and keep a lot of annotations, each time we hit 1000 loaded records we would dump them to standard output as a GAM chunk and throw them away.

This fixes that.
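To illustrate the shape of the bug, here is a minimal sketch (hypothetical names throughout; this is the pattern, not vg's actual code). The parser flushes its record buffer unconditionally at the 1000-record threshold, which is correct when the buffer feeds standard output but loses data when a caller passed the vector in to collect annotations:

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for a GAM record (vg's real type is vg::Alignment).
struct Alignment { std::string name; };

// Hypothetical serializer standing in for the GAM emitter.
void emit_gam(const std::vector<Alignment>& buffer) {
    for (const auto& aln : buffer) std::cout << aln.name << "\n";
}

// Buggy shape: flush to standard output every 1000 records, even when
// the caller wants to keep the records. Everything before each flush
// is serialized and then forgotten.
void parse_bed_buggy(std::istream& in, std::vector<Alignment>& out) {
    std::string line;
    while (std::getline(in, line)) {
        out.push_back({line});
        if (out.size() >= 1000) {
            emit_gam(out);  // wrong for collecting callers
            out.clear();    // ...and the records are lost
        }
    }
}

// Fixed shape: flush only when the buffer really is an output buffer,
// signaled here by an explicit flag (a sketch of the idea, not vg's
// actual interface).
void parse_bed_fixed(std::istream& in, std::vector<Alignment>& out,
                     bool buffer_is_output) {
    std::string line;
    while (std::getline(in, line)) {
        out.push_back({line});
        if (buffer_is_output && out.size() >= 1000) {
            emit_gam(out);
            out.clear();
        }
    }
}
```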

jmonlong (Contributor) commented Aug 1, 2024

Sorry about the bug I introduced, @adamnovak. I didn't notice that the region-parsing function was used somewhere else...

I noticed the issue this week while working to make vg annotate output GAF. In #4364 I think we get both the fix and the GAF output feature. If it looks good to you, could we merge that one instead (or add a GAF output mode to yours)?

adamnovak merged commit 683aa18 into master on Aug 7, 2024
2 checks passed