Releases: Natooz/MidiTok

v1.1.9 MASK tokens & bugfixes

16 Feb 09:08

Changes

  • d2a3404 22c15f2 When tokenizing, files not found can now be logged
  • 1f74e3a MASK tokens are now available; use your_tokenizer.vocab.add_mask() to add one to your vocabulary.
  • 1f74e3a Fix for a possible bug when registering tokens with custom vocabulary indexes and then with non-custom ones. Indexes are now checked before tokens are registered.
  • 2f87765 merge_tracks() now also merges sustain pedal, control change and pitch bend messages with the effects arg, and now handles List[Instrument] as well as MidiFile objects.
  • 2e77eec MIDI-Like token_types_errors() now looks for a Note-Off token for each Note-On token.

Compatibility

All good !

v1.1.8 Time Signature tokens for Octuple

24 Jan 14:16

Changes

  • 3c3c5da TimeSignature tokens are implemented for Octuple! Thanks @ilya16 for this great contribution! These tokens are optional and can be set with the additional_tokens parameter.
  • df1edd1 Added a fail-check for Bar / Position based tokenizations, for when a token sequence begins with a Position token before any Bar token.
  • 5ab55f4 Bugfix when loading tokenizer params with tempos from a config file.
  • 08540a2 SOS and EOS tokens are no longer assigned the indexes -1 and -2, as this could lead to issues.

Compatibility

SOS and EOS tokens saved with v1.1.7 and before will not be compatible anymore.
You can however easily convert them: replace SOS (-1) with len(tokenizer.vocab) and EOS (-2) with len(tokenizer.vocab) + 1.
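
The conversion described above can be sketched in plain Python. This is only an illustration: convert_sos_eos is a hypothetical helper (not part of MidiTok), and vocab_len stands in for len(tokenizer.vocab) of the tokenizer you load with the new version.

```python
from typing import List

# Legacy indexes used by v1.1.7 and earlier, as stated in the note above.
OLD_SOS, OLD_EOS = -1, -2


def convert_sos_eos(tokens: List[int], vocab_len: int) -> List[int]:
    """Remap legacy SOS/EOS indexes; every other token is left untouched.

    vocab_len is assumed to be len(tokenizer.vocab) (hypothetical stand-in).
    """
    remap = {OLD_SOS: vocab_len, OLD_EOS: vocab_len + 1}
    return [remap.get(tok, tok) for tok in tokens]


# Example with a made-up sequence and a made-up vocabulary size of 200.
legacy = [-1, 12, 57, 3, -2]
converted = convert_sos_eos(legacy, vocab_len=200)
print(converted)  # [200, 12, 57, 3, 201]
```

For multi-track or nested token lists, the same helper would be applied to each inner sequence.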

v1.1.7 Class renamed

05 Jan 11:28

Changes

  • 195d549 Tokenizer classes are renamed: the 'Encoding' suffix is removed. Old class names still exist and work, but will be removed in the future (a warning is raised when using them).
  • 195d549 The constants import is modified: constants now have to be accessed as miditok.constants.A_CONSTANT.
  • 3ed5532 PAD token types are now handled in the token_types_errors methods.

v1.1.6 Speed up

07 Dec 14:21

Changes

  • 714f5f5 Speed up duration / time-shift computations
  • 5fe18be Speed up quantization of velocity and tempo values

Special thanks to @ilya16 !

v1.1.5 Fixes and debugging

01 Dec 14:42

Changes

  • c7169fe rests no longer append a bar token when crossing a new bar (bugfix)
  • 9457178 fix in token types graph for REMI / CP Word
  • 8a6da14 events_to_tokens and tokens_to_events are no longer protected methods, so they can be used for debugging

Compatibility:

  • MIDI files tokenized with REMI or CP Word using Rests with v1.1.4 and below might not be compatible, as the decoding process changed (c7169fe)

v1.1.4 Bugfix rest detections

18 Nov 09:07

v1.1.3 Bugfix chord detection

04 Nov 14:19
  • 9a5975c bugfix in the chord detection method, which was comparing lists with tuples for chord qualities

v1.1.2 Token sequence types validation & Bugfixes

28 Oct 16:47

Changes:

  • da36b4a token_types_errors method introduced. It checks whether a generated sequence of tokens is made of valid successions of token types and values. This rule-based metric is useful to measure whether a network understands the "semantics" of a tokenization strategy. Note that the validation is computed differently depending on the tokenization strategy; we refer you to the docstrings.
  • Fixes in _create_token_types_graph for CP Word, MIDI-Like, MuMIDI and Remi
  • 2dc9fae When using Rests with Remi and CP Word, a Bar token is now added after one or several Rest tokens if the rest crosses one or several bars.
  • token_types_errors is included in the test scripts
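
To illustrate the idea of a rule-based check on token type successions, here is a minimal sketch in plain Python. It is not MidiTok's implementation: EXAMPLE_GRAPH and succession_error_ratio are hypothetical names, the graph is a simplified made-up example, and only type successions are checked (not token values).

```python
from typing import Dict, List

# Hypothetical token types graph: each type maps to the types allowed to follow it.
EXAMPLE_GRAPH: Dict[str, List[str]] = {
    "Bar": ["Position"],
    "Position": ["Pitch"],
    "Pitch": ["Velocity"],
    "Velocity": ["Duration"],
    "Duration": ["Pitch", "Position", "Bar"],
}


def succession_error_ratio(token_types: List[str], graph: Dict[str, List[str]]) -> float:
    """Return the fraction of transitions whose next type is not allowed by the graph."""
    if len(token_types) < 2:
        return 0.0  # no transition to check
    errors = sum(
        1
        for current, following in zip(token_types, token_types[1:])
        if following not in graph.get(current, [])
    )
    return errors / (len(token_types) - 1)


valid = ["Bar", "Position", "Pitch", "Velocity", "Duration", "Pitch", "Velocity", "Duration"]
invalid = ["Bar", "Pitch", "Velocity", "Duration", "Duration"]

print(succession_error_ratio(valid, EXAMPLE_GRAPH))    # 0.0
print(succession_error_ratio(invalid, EXAMPLE_GRAPH))  # 0.5 (2 invalid transitions out of 4)
```

A ratio of 0.0 means every transition follows the graph; higher values indicate that a generated sequence breaks the expected structure more often.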

Compatibility:

  • Tokens previously created with Remi or CP Word using Rests may not be compatible with v1.1.2

v1.1.1 Program tokens and tokens_types_graph

26 Oct 17:29

Changes:

  • c3d6c89 new attribute tokens_types_graph for every class, to be used to check if a generated sequence is made of valid token successions
  • bd9aade The Program token type is now part of the additional_tokens attribute. MidiTok never uses these tokens itself; they are there if you need them

Compatibility:

  • _create_token_types_graph is called by MIDITokenizer's constructor, so your custom classes must implement it (it can return None)
  • Your datasets tokenized with <= v1.1.0 will stay compatible but you won't be able to load them without adding the 'Program' key in the config.txt files.

v1.1.0 Vocabulary class

26 Sep 10:02
  • 5826734 New Vocabulary class handling vocabulary creation, and event <--> token mapping
  • ea49656 SOS and EOS tokens are not included by default; sos_eos_tokens lets you handle this
  • 9a9db25 save_tokens and load_tokens methods
  • 9a9db25 bugfix in detect_chords with only_known_chord param