v1.1.2 Token sequence types validation & Bugfixes

@Natooz Natooz released this 28 Oct 16:47
· 465 commits to main since this release

Changes:

  • da36b4a Introduced the token_types_errors method, which checks whether a generated sequence of tokens consists of valid successions of token types and valid values. This rule-based metric is useful for measuring whether a network has learned the "semantics" of a tokenization strategy. Note that the validation is computed differently depending on the tokenization strategy; see the docstring for details.
  • Fixes in _create_token_types_graph for CP Word, MIDI-Like, MuMIDI and Remi
  • 2dc9fae When using Rests with Remi and CP Word, a Bar token is now inserted after one or several Rest tokens if the rest crosses one or more bars.
  • token_types_errors is now included in the test scripts
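
To illustrate the idea behind a token type succession check, here is a minimal standalone sketch. It is not the library's implementation: the graph contents, the "Type_value" string format, and the names TOKEN_TYPES_GRAPH, token_type and succession_error_ratio are all assumptions made for this example, and it only checks type successions, not token values. The actual rules depend on the tokenization strategy, as described in the docstring.

```python
from typing import Dict, List, Set

# Hypothetical, simplified graph of allowed token type successions.
# Each key maps a token type to the set of types allowed right after it.
# This is an illustrative assumption, not the graph used by the library.
TOKEN_TYPES_GRAPH: Dict[str, Set[str]] = {
    "Bar": {"Bar", "Position"},
    "Position": {"Pitch"},
    "Pitch": {"Velocity"},
    "Velocity": {"Duration"},
    "Duration": {"Pitch", "Position", "Bar"},
}


def token_type(token: str) -> str:
    """Return the type of a token written as 'Type_value' (assumed format)."""
    return token.split("_", 1)[0]


def succession_error_ratio(tokens: List[str]) -> float:
    """Return the fraction of consecutive token pairs whose type succession
    is not allowed by TOKEN_TYPES_GRAPH."""
    if len(tokens) < 2:
        return 0.0
    errors = 0
    for previous, current in zip(tokens, tokens[1:]):
        allowed = TOKEN_TYPES_GRAPH.get(token_type(previous), set())
        if token_type(current) not in allowed:
            errors += 1
    return errors / (len(tokens) - 1)


valid = ["Bar_None", "Position_0", "Pitch_60", "Velocity_100", "Duration_1.0", "Bar_None"]
invalid = ["Bar_None", "Pitch_60", "Velocity_100", "Position_0"]
print(succession_error_ratio(valid))
print(succession_error_ratio(invalid))
```

In this sketch, a ratio of 0 means every consecutive pair follows the graph, and a higher ratio means the sequence breaks the assumed rules more often.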

Compatibility:

  • Tokens previously created with Remi or CP Word using Rests may not be compatible with v1.1.2