This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Releases: allenai/allennlp

v2.0.0rc1

22 Jan 01:20
Pre-release

This is the first (and hopefully only) release candidate for AllenNLP 2.0. The APIs are still subject to change until the final 2.0 release, at which point we'll provide a detailed write-up, including a migration guide. In the meantime, here are the headline features of AllenNLP 2.0:

  • Support for models that combine language and vision features
  • Transformer Toolkit, a suite of classes and components that make it easy to experiment with transformer architectures
  • A framework for multitask training
  • Revamped data loading, for improved performance and flexibility

What's new

Added 🎉

  • Added a TensorCache class for caching tensors on disk.
  • Added an abstraction and concrete implementation for image loading.
  • Added an abstraction and concrete implementation for GridEmbedder.
  • Added an abstraction and demo implementation for an image augmentation module.
  • Added an abstraction and concrete implementation for region detectors.
  • A new high-performance default DataLoader: MultiProcessDataLoader.
  • A MultiTaskModel and abstractions to use with it, including Backbone and Head. The
    MultiTaskModel first runs its inputs through the Backbone, then passes the result (and
    whatever other relevant inputs it got) to each Head that's in use.
  • A MultiTaskDataLoader, with a corresponding MultiTaskDatasetReader, and a couple of new
    configuration objects: MultiTaskEpochSampler (for deciding what proportion to sample from each
    dataset at every epoch) and a MultiTaskScheduler (for ordering the instances within an epoch).
  • Transformer toolkit to plug and play with modular components of transformer architectures.
  • Added a command to count the number of instances we're going to be training with.
  • Added a FileLock class to common.file_utils. This is just like the FileLock from the filelock library, except that
    it adds an optional flag read_only_ok: bool, which when set to True changes the behavior so that a warning will be emitted
    instead of an exception when lacking write permissions on an existing file lock.
    This makes it possible to use the FileLock class on a read-only file system (see the example after this list).
  • Added a new learning rate scheduler: CombinedLearningRateScheduler. This can be used to combine different LR schedulers, using one after the other.
  • Added an official CUDA 10.1 Docker image.
  • Moved the ModelCard and TaskCard abstractions into the main repository.
  • Added a util function allennlp.nn.util.dist_reduce(...) for handling distributed reductions.
    This is especially useful when implementing a distributed Metric.
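
As an illustration of the new FileLock flag, here is a minimal sketch, assuming read_only_ok is accepted by the constructor; the paths are hypothetical.

```python
from allennlp.common.file_utils import FileLock

# Hypothetical resource living on a file system that may be mounted read-only.
resource_path = "/data/readonly_cache/my_resource"

# With read_only_ok=True, lacking write permission on the lock file only emits a
# warning instead of raising, so the protected read below can still proceed.
with FileLock(resource_path + ".lock", read_only_ok=True):
    with open(resource_path, "rb") as f:
        data = f.read()
```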

Changed ⚠️

  • DatasetReaders are now always lazy. This means there is no lazy parameter in the base
    class, and the _read() method should always be a generator (a minimal sketch follows this list).
  • The DataLoader now decides whether to load instances lazily or not.
    With the PyTorchDataLoader this is controlled with the lazy parameter, but with
    the MultiProcessDataLoader it is controlled by the max_instances_in_memory setting.
  • ArrayField is now called TensorField, and implemented in terms of torch tensors, not numpy.
  • Improved nn.util.move_to_device function by avoiding an unnecessary recursive check for tensors and
    adding a non_blocking optional argument, which is the same argument as in torch.Tensor.to().
  • If you are trying to create a heterogeneous batch, you now get a better error message.
  • Readers using the new vision features now explicitly log how they are featurizing images.
  • master_addr and master_port renamed to primary_addr and primary_port, respectively.
  • is_master parameter for training callbacks renamed to is_primary.
  • The master branch was renamed to main.
  • Torch version bumped to 1.7.1 in Docker images.
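
To make the always-lazy reading behavior concrete, here is a minimal sketch of a DatasetReader whose _read() is a generator. The reader name, file format, and field names are hypothetical; only the generator requirement comes from the entry above.

```python
from typing import Iterable

from allennlp.data import DatasetReader, Instance
from allennlp.data.fields import TextField
from allennlp.data.token_indexers import SingleIdTokenIndexer
from allennlp.data.tokenizers import WhitespaceTokenizer


@DatasetReader.register("lines-reader")  # hypothetical registered name
class LinesReader(DatasetReader):
    def __init__(self, **kwargs) -> None:
        super().__init__(**kwargs)
        self._tokenizer = WhitespaceTokenizer()
        self._token_indexers = {"tokens": SingleIdTokenIndexer()}

    def _read(self, file_path: str) -> Iterable[Instance]:
        # There is no `lazy` flag anymore: just yield instances one at a time and
        # let the DataLoader decide how many to keep in memory.
        with open(file_path) as data_file:
            for line in data_file:
                yield self.text_to_instance(line.strip())

    def text_to_instance(self, text: str) -> Instance:
        tokens = self._tokenizer.tokenize(text)
        return Instance({"text": TextField(tokens, self._token_indexers)})
```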

Removed 👋

  • Removed nn.util.has_tensor.

Fixed ✅

  • The build-vocab command no longer crashes when the resulting vocab file is
    in the current working directory.
  • Fixed typo with LabelField string representation: removed trailing apostrophe.
  • Vocabulary.from_files and cached_path will issue a warning, instead of failing, when a lock on an existing resource
    can't be acquired because the file system is read-only.
  • TrackEpochCallback is now an EpochCallback.

Commits

9a4a424 Moves vision models to allennlp-models (#4918)
412896b fix merge conflicts
ed322eb A helper for distributed reductions (#4920)
9ab2bf0 add CUDA 10.1 Docker image (#4921)
d82287e Update transformers requirement from <4.1,>=4.0 to >=4.0,<4.2 (#4872)
5497394 Multitask example (#4898)
0f00d4d resolve _read type (#4916)
5229da8 Toolkit decoder (#4914)
4183a49 Update mkdocs-material requirement from <6.2.0,>=5.5.0 to >=5.5.0,<6.3.0 (#4880)
d7c9eab improve worker error handling in MultiProcessDataLoader (#4912)
94dd9cc rename 'master' -> 'primary' for distributed training (#4910)
c9585af fix imports in file_utils
03c7ffb Merge branch 'main' into vision
effcc4e improve data loading docs (#4909)
2f54570 remove PyTorchDataLoader, add SimpleDataLoader for testing (#4907)
31ec6a5 MultiProcessDataLoader takes PathLike data_path (#4908)
5e3757b rename 'multi_process_*' -> 'multiprocess' for consistency (#4906)
df36636 Data loading cuda device (#4879)
aedd3be Toolkit: Cleaning up TransformerEmbeddings (#4900)
54e85ee disable codecov annotations (#4902)
2623c4b Making TrackEpochCallback an EpochCallback (#4893)
1d21c75 issue warning instead of failing when lock can't be acquired on a resource that exists in a read-only file system (#4867)
ec197c3 Create pull_request_template.md (#4891)
15d32da Make GQA work (#4884)
fbab0bd import MultiTaskDataLoader to data_loaders/init.py (#4885)
d1cc146 Merge branch 'main' into vision
abacc01 Adding f1 score (#4890)
9cf41b2 fix navbar link
9635af8 rename 'master' -> 'main' (#4887)
d0a07fb docs: fix simple typo, multplication -> multiplication (#4883)
d1f032d Moving modelcard and taskcard abstractions to main repo (#4881)
f62b819 Make images easier to find for Visual Entailment (#4878)
1fff7ca Update docker torch version (#4873)
7a7c7ea Only cache, no featurizing (#4870)
d2aea97 Fix typo in str (#4874)
1c72a30 Merge branch 'master' into vision
6a8d425 add CombinedLearningRateScheduler (#4871)
85d38ff doc fixes
c4e3f77 Switch to torchvision for vision components 👀, simplify and improve MultiProcessDataLoader (#4821)
3da8e62 Merge branch 'master' into vision
a3732d0 Fix cache volume (#4869)
832901e Turn superfluous warning to info when extending the vocab in the embedding matrix (#4854)
147fefe Merge branch 'master' into vision
87e3536 Make tests work again (#4865)
d16a5c7 Merge remote-tracking branch 'origin/master' into vision
457e56e Merge branch 'master' into vision
c8521d8 Toolkit: Adding documentation and small changes for BiModalAttention (#4859)
ddbc740 gqa reader fixes during vilbert training (#4851)
50e50df Generalizing transformer layers (#4776)
52fdd75 adding multilabel option (#4843)
7887119 Other VQA datasets (#4834)
e729e9a Added GQA reader (#4832)
52e9dd9 Visual entailment model code (#4822)
01f3a2d Merge remote-tracking branch 'origin/master' into vision
3be6c97 SNLI_VE dataset reader (#4799)
b659e66 VQAv2 (#4639)
c787230 Merge remote-tracking branch 'origin/master' into vision
db2d1d3 Merge branch 'master' into vision
6bf1924 Merge branch 'master' into vision
167bcaa remove vision push trigger
7591465 Merge remote-tracking branch 'origin/master' into vision
22d4633 improve independence of vision components (#4793)
98018cc fix merge conflicts
c780315 fix merge conflicts
5d22ce6 Merge remote-tracking branch 'origin/master' into vision
602399c update with master
ffafaf6 Multitask data loading and scheduling (#4625)
7c47c3a Merge branch 'master' into vision
12c8d1b Generalizing self attention (#4756)
63f61f0 Merge remote-tracking branch 'origin/master' into vision
b48347b Merge remote-tracking branch 'origin/master' into vision
81892db fix failing tests
98edd25 update torch requirement
8da3508 update with master
cc53afe separating TransformerPooler as a new module (#4730)
4ccfa88 Transformer toolkit: BiModalEncoder now has separate num_attention_heads for both modalities (#4728)
91631ef Transformer toolkit (#4577)
677a9ce Merge remote-tracking branch 'origin/master' into vision
2985236 This should have been part of the previously merged PR
c5d264a Detectron NLVR2 (#4481)
e39a5f6 Merge remote-tracking branch 'origin/master' into vision
f1e46fd Add MultiTaskModel (#4601)
fa22f73 Merge remote-tracking branch 'origin/master' into vision
41872ae Merge remote-tracking branch 'origin/master' into vision
f886fd0 Merge remote-tracking branch 'origin/master' into vision
191b641 make existing readers work with multi-process loading (#4597)
d7124d4 fix len calculation for new data loader (#4618)
8746361 Merge branch 'master' into vision
319794a remove duplicate padding calculations in collate fn (#4617)
de9165e rename 'node_rank' to 'global_rank' in dataset reader 'DistributedInfo' (#4608)
3d11419 Formatting updates for new version of black (#4607)
cde06e6 Changelog
1b08fd6 ensure models check runs on right branch
44c8791 ensure vision CI runs on each commit (#4582)
95e8253 Merge branch 'master' into vision
e74a736 new data loading (#4497)
6f82005 Merge remote-tracking branch 'origin/master' into vision
a7d45de Initializing a VilBERT model from a pre-trained transformer (#4495)
3833f7a Merge branch 'master' into vision
71d7cb4 Merge branch 'master' into vision
3137961 Merge remote-tracking branch 'origin/master' into vision
6cc508d Merge branch 'master' into vision
f87df83 Merge remote-tracking branch 'origin/master' into vision
0bbe84b An initial VilBERT model for NLVR...


v1.3.0

15 Dec 22:10

What's new

Added 🎉

  • Added links to source code in docs.
  • Added get_embedding_layer and get_text_field_embedder to the Predictor class, to allow specifying embedding layers for non-AllenNLP models.
  • Added Gaussian Error Linear Unit (GELU) as an Activation.
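
A small sketch of using the new activation, assuming it is registered under the name "gelu":

```python
import torch
from allennlp.nn import Activation

# Look up the activation by its registered name and instantiate it.
gelu = Activation.by_name("gelu")()
print(gelu(torch.linspace(-2.0, 2.0, steps=5)))
```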

Changed ⚠️

  • Renamed module allennlp.data.tokenizers.token to allennlp.data.tokenizers.token_class to avoid
    a known bug.
  • transformers dependency updated to version 4.0.1.

Fixed ✅

  • Fixed a lot of instances where tensors were first created and then sent to a device
    with .to(device). Instead, these tensors are now created directly on the target device.
  • Fixed issue with GradientDescentTrainer when constructed with validation_data_loader=None and learning_rate_scheduler!=None.
  • Fixed a bug when removing all handlers in root logger.
  • ShardedDatasetReader now inherits parameters from base_reader when required.
  • Fixed an issue in FromParams where parameters in the params object used to construct a class
    were not passed to the constructor if the value of the parameter was equal to the default value.
    This caused bugs in some edge cases where a subclass that takes **kwargs needs to inspect
    kwargs before passing them to its superclass.
  • Improved the band-aid solution for segmentation faults and the "ImportError: dlopen: cannot load any more object with static TLS"
    by adding a transformers import.
  • Added safety checks for extracting tar files.

Commits

d408f41 log import errors for default plugins (#4866)
f2a5331 Adds a safety check for tar files (#4858)
84a36a0 Update transformers requirement from <3.6,>=3.4 to >=4.0,<4.1 (#4831)
fdad31a Add ability to specify the embedding layer if the model does not use TextFieldEmbedder (#4836)
41c5224 Improve the band-aid solution for seg faults and the static TLS error (#4846)
63b6d16 fix FromParams bug (#4841)
6c3238e rename token.py -> token_class.py (#4842)
cec9209 Several micro optimizations (#4833)
48a4865 Add GELU activation (#4828)
3e62365 Bugfix for attribute inheritance in ShardedDatasetReader (#4830)
458c4c2 fix the way handlers are removed from the root logger (#4829)
5b30658 Fix bug in GradientDescentTrainer when validation data is absent (#4811)
f353c6c add link to source code in docs (#4807)
0a83271 No Docker auth on PRs (#4802)
ad8e8a0 no ssh setup on PRs (#4801)

v1.2.2

17 Nov 18:26

What's new

Added 🎉

  • Added Docker builds for other torch-supported versions of CUDA.
  • Added allennlp-semparse as an official, default plugin.

Fixed ✅

  • GumbelSampler now sorts the beams by their true log prob.

Commits

023d9bc Prepare for release v1.2.2
7b0826c push commit images for both CUDA versions
3cad5b4 fix AUC test (#4795)
efde092 upgrade ssh-agent action (#4797)
ec37dd4 Docker builds for other CUDA versions, improve CI (#4796)
0d8873c doc link quickfix
e4cc95c improve plugin section in README (#4789)
d99f7f8 ensure Gumbel sorts beams by true log prob (#4786)
9fe8d90 Makes the transformer cache work with custom kwargs (#4781)
1e7492d Update transformers requirement from <3.5,>=3.4 to >=3.4,<3.6 (#4784)
f27ef38 Fixes pretrained embeddings for transformers that don't have end tokens (#4732)

v1.2.1

11 Nov 00:22

What's new

Added 🎉

  • Added an optional seed parameter to ModelTestCase.set_up_model which sets the random
    seed for random, numpy, and torch.
  • Added support for a global plugins file at ~/.allennlp/plugins.
  • Added more documentation about plugins.
  • Added a sampler class and parameter in beam search for non-deterministic search, with several
    implementations, including MultinomialSampler, TopKSampler, TopPSampler, and
    GumbelSampler. Using GumbelSampler gives Stochastic Beam Search (see the sketch below).
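
A minimal sketch of plugging one of the new samplers into BeamSearch; the import path, constructor arguments, and token ids are assumptions based on the entry above rather than a verified API.

```python
from allennlp.nn.beam_search import BeamSearch, TopKSampler

end_index = 2  # hypothetical end-of-sequence token id

# Instead of the default deterministic search, sample from the top 10
# candidates at each decoding step.
beam_search = BeamSearch(
    end_index=end_index,
    max_steps=50,
    beam_size=5,
    sampler=TopKSampler(k=10),  # or MultinomialSampler / TopPSampler / GumbelSampler
)
```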

Changed ⚠️

  • Pass batch metrics to BatchCallback.

Fixed ✅

  • Fixed a bug where forward hooks were not cleaned up with saliency interpreters if there
    was an exception.
  • Fixed the computation of saliency maps in the Interpret code when using mismatched indexing.
    Previously, we would compute gradients from the top of the transformer, after aggregation from
    wordpieces to tokens, which gives results that are not very informative. Now, we compute gradients
    with respect to the embedding layer, and aggregate wordpieces to tokens separately.
  • Fixed the heuristics for finding embedding layers in the case of RoBERTa. An update in the
    transformers library broke our old heuristic.
  • Fixed typo with registered name of ROUGE metric. Previously was rogue, fixed to rouge.
  • Fixed default masks that were erroneously created on the CPU even when a GPU is available.

Commits

04247fa support global plugins file, improve plugins docs (#4779)
9f7cc24 Add sampling strategies to beam search (#4768)
f6fe8c6 pin urllib3 in dev reqs for responses (#4780)
764bbe2 Pass batch metrics to BatchCallback (#4764)
dc3a4f6 clean up forward hooks on exception (#4778)
fcc3a70 Fix: typo in metric, rogue -> rouge (#4777)
b89320c Set the device for an auto-created mask (#4774)
92a844a RoBERTa embeddings are no longer a type of BERT embeddings (#4771)
23f0a8a Ensure cnn_encoder respects masking (#4746)
b4f1a7a add seed option to ModelTestCase.set_up_model (#4769)
b7cec51 Made Interpret code handle mismatched cases better (#4733)
9759b15 allow TextFieldEmbedder to have EmptyEmbedder that may not be in input (#4761)

v1.2.0

29 Oct 21:37

What's new

Changed ⚠️

  • Enforced stricter typing requirements around the use of Optional[T] types.
  • Changed the behavior of Lazy types in from_params methods. Previously, if you defined a Lazy parameter like
    foo: Lazy[Foo] = None in a custom from_params classmethod, then foo would actually never be None.
    This behavior is now different: if no params were given for foo, it will be None.
    You can also now set default values for foo, like foo: Lazy[Foo] = Lazy(Foo).
    Or, if you want a default value but also want to allow for None values, you can
    write it like this: foo: Optional[Lazy[Foo]] = Lazy(Foo). A sketch follows this list.
  • Added support for PyTorch version 1.7.
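
A hedged sketch of the new Lazy defaults described above; Foo, Bar, and their parameters are hypothetical, and Lazy.construct() builds the wrapped object.

```python
from typing import Optional

from allennlp.common import FromParams
from allennlp.common.lazy import Lazy


class Foo(FromParams):
    def __init__(self, size: int = 10) -> None:
        self.size = size


class Bar(FromParams):
    # Lazy(Foo) gives a default that constructs a plain Foo() when no params are
    # supplied; Optional[Lazy[Foo]] additionally allows an explicit None.
    def __init__(self, foo: Optional[Lazy[Foo]] = Lazy(Foo)) -> None:
        self.foo = foo.construct() if foo is not None else None


print(Bar().foo.size)  # -> 10
```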

Fixed ✅

  • Made it possible to instantiate TrainerCallback from config files.
  • Fixed the remaining broken internal links in the API docs.
  • Fixed a bug where Hotflip would crash with a model that had multiple TokenIndexers and the input
    used rare vocabulary items.
  • Fixed a bug where BeamSearch would fail if max_steps was equal to 1.

Commits

7f85c74 fix docker build (#4762)
cc9ac0f ensure dataclasses not installed in CI (#4754)
812ac57 Fix hotflip bug where vocab items were not re-encoded correctly (#4759)
aeb6d36 revert samplers and fix bug when max_steps=1 (#4760)
baca754 Make returning token type id default in transformers intra word tokenization. (#4758)
5d6670c Update torch requirement from <1.7.0,>=1.6.0 to >=1.6.0,<1.8.0 (#4753)
0ad228d a few small doc fixes (#4752)
71a98c2 stricter typing for Optional[T] types, improve handling of Lazy params (#4743)
27edfbf Add end+trainer callbacks to Trainer.from_partial_objects (#4751)
b792c83 Fix device mismatch bug for categorical accuracy metric in distributed training (#4744)

v1.2.0rc1

22 Oct 21:45
Pre-release

What's new

Added 🎉

  • Added a warning when batches_per_epoch for the validation data loader is inherited from
    the train data loader.
  • Added a build-vocab subcommand that can be used to build a vocabulary from a training config file.
  • Added tokenizer_kwargs argument to PretrainedTransformerMismatchedIndexer.
  • Added tokenizer_kwargs and transformer_kwargs arguments to PretrainedTransformerMismatchedEmbedder.
  • Added official support for Python 3.8.
  • Added a script: scripts/release_notes.py, which automatically prepares markdown release notes from the
    CHANGELOG and commit history.
  • Added a flag --predictions-output-file to the evaluate command, which tells AllenNLP to write the
    predictions from the given dataset to the file as JSON lines.
  • Added the ability to ignore certain missing keys when loading a model from an archive. This is done
    by adding a class-level variable called authorized_missing_keys to any PyTorch module that a Model uses.
    If defined, authorized_missing_keys should be a list of regex string patterns (see the sketch after this list).
  • Added FBetaMultiLabelMeasure, a multi-label Fbeta metric. This is a subclass of the existing FBetaMeasure.
  • Added the ability to pass additional keyword arguments to cached_transformers.get(), which will be passed on to AutoModel.from_pretrained().
  • Added an overrides argument to Predictor.from_path().
  • Added a cached-path command.
  • Added a function inspect_cache to common.file_utils that prints useful information about the cache. This can also
    be used from the cached-path command with allennlp cached-path --inspect.
  • Added a function remove_cache_entries to common.file_utils that removes any cache entries matching the given
    glob patterns. This can be used from the cached-path command with allennlp cached-path --remove some-files-*.
  • Added logging for the main process when running in distributed mode.
  • Added a TrainerCallback object to support state sharing between batch and epoch-level training callbacks.
  • Added support for .tar.gz in PretrainedModelInitializer.
  • Added classes in nn/samplers/samplers.py: MultinomialSampler, TopKSampler, and TopPSampler, for
    sampling indices from log probabilities.
  • Made BeamSearch registrable.
  • Added top_k_sampling and top_p_sampling BeamSearch implementations.
  • Pass serialization_dir to Model and DatasetReader.
  • Added an optional include_in_archive parameter to the top-level of configuration files. When specified, include_in_archive should be a list of paths relative to the serialization directory which will be bundled up with the final archived model from a training run.
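
A minimal sketch of the authorized_missing_keys hook mentioned above; the module and the regex pattern are hypothetical.

```python
from typing import List

import torch


class DecoderWithFreshHead(torch.nn.Module):
    # Class-level variable: state-dict keys matching any of these regex patterns
    # may be missing when a Model using this module is loaded from an archive.
    authorized_missing_keys: List[str] = [r"output_projection\..*"]

    def __init__(self, hidden_dim: int = 128, vocab_size: int = 1000) -> None:
        super().__init__()
        self.output_projection = torch.nn.Linear(hidden_dim, vocab_size)
```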

Changed ⚠️

  • Subcommands that don't require plugins will no longer cause plugins to be loaded or have an --include-package flag.
  • Allow overrides to be a JSON string or a dict (see the example after this list).
  • transformers dependency updated to version 3.1.0.
  • When cached_path is called on a local archive with extract_archive=True, the archive is now extracted into a unique subdirectory of the cache root instead of a subdirectory of the archive's directory. The extraction directory is also unique to the modification time of the archive, so if the file changes, subsequent calls to cached_path will know to re-extract the archive.
  • Removed the truncation_strategy parameter to PretrainedTransformerTokenizer. The way we call the tokenizer, the truncation strategy had no effect anyway.
  • Don't use initializers when loading a model, as they are not needed.
  • Distributed training will now automatically search for a local open port if the master_port parameter is not provided.
  • In training, save model weights before evaluation.
  • allennlp.common.util.peak_memory_mb renamed to peak_cpu_memory, and allennlp.common.util.gpu_memory_mb renamed to peak_gpu_memory,
    and they both now return the results in bytes as integers. Also, the peak_gpu_memory function now utilizes PyTorch functions to find the memory
    usage instead of shelling out to the nvidia-smi command. This is more efficient and also more accurate because it only takes
    into account the tensor allocations of the current PyTorch process.
  • Make sure weights are first loaded to the cpu when using PretrainedModelInitializer, preventing wasted GPU memory.
  • Load dataset readers in load_archive.
  • Updated the AllenNlpTestCase docstring to remove the reference to unittest.TestCase.
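
For illustration, overrides passed as a dict rather than a JSON string; the archive path and the override key are hypothetical.

```python
from allennlp.models.archival import load_archive

# `overrides` may now be a dict as well as a JSON string.
archive = load_archive(
    "model.tar.gz",                    # hypothetical archive path
    overrides={"model.dropout": 0.1},  # hypothetical override key
)
```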

Removed 👋

  • Removed common.util.is_master function.

Fixed ✅

  • Fixed a bug where the reported batch_loss metric was incorrect when training with gradient accumulation.
  • Class decorators now displayed in API docs.
  • Fixed up the documentation for the allennlp.nn.beam_search module.
  • Ignore *args when constructing classes with FromParams.
  • Ensured some consistency in the types of the values that metrics return.
  • Fix a PyTorch warning by explicitly providing the as_tuple argument (leaving
    it as its default value of False) to Tensor.nonzero().
  • Remove temporary directory when extracting model archive in load_archive
    at end of function rather than via atexit.
  • Fixed a bug where using cached_path() offline could return a cached resource's lock file instead
    of the cache file.
  • Fixed a bug where cached_path() would fail if passed a cache_dir with the user home shortcut ~/.
  • Fixed a bug in our doc building script where markdown links did not render properly
    if the "href" part of the link (the part inside the ()) was on a new line.
  • Changed how gradients are zeroed out, following an optimization recommended in an NVIDIA
    presentation (around the 9 minute mark).
  • Fixed a bug where parameters to a FromParams class that are dictionaries wouldn't get logged
    when an instance is instantiated from_params.
  • Fixed a bug in distributed training where the vocab would be saved from every worker, when it should have been saved by only the local master process.
  • Fixed a bug in the calculation of rouge metrics during distributed training where the total sequence count was not being aggregated across GPUs.
  • Fixed allennlp.nn.util.add_sentence_boundary_token_ids() to use device parameter of input tensor.
  • Be sure to close the TensorBoard writer even when training doesn't finish.
  • Fixed the docstring for PyTorchSeq2VecWrapper.

Commits

01644ca Pass serialization_dir to Model, DatasetReader, and support include_in_archive (#4713)
1f29f35 Update transformers requirement from <3.4,>=3.1 to >=3.1,<3.5 (#4741)
6bb9ce9 warn about batches_per_epoch with validation loader (#4735)
00bb6c5 Be sure to close the TensorBoard writer (#4731)
3f23938 Update mkdocs-material requirement from <6.1.0,>=5.5.0 to >=5.5.0,<6.2.0 (#4738)
10c11ce Fix typo in PretrainedTransformerMismatchedEmbedder docstring (#4737)
0e64b4d fix docstring for PyTorchSeq2VecWrapper (#4734)
006bab4 Don't use PretrainedModelInitializer when loading a model (#4711)
ce14bdc Allow usage of .tar.gz with PretrainedModelInitializer (#4709)
c14a056 avoid defaulting to CPU device in add_sentence_boundary_token_ids() (#4727)
24519fd fix typehint on checkpointer method (#4726)
d3c69f7 Bump mypy from 0.782 to 0.790 (#4723)
cccad29 Updated AllenNlpTestCase docstring (#4722)
3a85e35 add reasonable timeout to gpu checks job (#4719)
1ff0658 Added logging for the main process when running in distributed mode (#4710)
b099b69 Add top_k and top_p sampling to BeamSearch (#4695)
bc6f15a Fixes rouge metric calculation corrected for distributed training (#4717)
ae7cf85 automatically find local open port in distributed training (#4696)
321d4f4 TrainerCallback with batch/epoch/end hooks (#4708)
001e1f7 new way of setting env variables in GH Actions (#4700)
c14ea40 Save checkpoint before running evaluation (#4704)
40bb47a Load weights to cpu with PretrainedModelInitializer (#4712)
327188b improve memory helper functions (#4699)
90f0037 fix reported batch_loss (#4706)
39ddb52 CLI improvements (#4692)
edcb6d3 Fix a bug in saving vocab during distributed training (#4705)
3506e3f ensure parameters that are actual dictionaries get logged (#4697)
eb7f256 Add StackOverflow link to README (#4694)
17c3b84 Fix small typo (#4686)
e0b2e26 display class decorators in API docs (#4685)
b9a9284 Update transformers requirement from <3.3,>=3.1 to >=3.1,<3.4 (#4684)
d9bdaa9 add build-vocab command (#4655)
ce604f1 Update mkdocs-material requirement from <5.6.0,>=5.5.0 to >=5.5.0,<6.1.0 (#4679)
c3b5ed7 zero grad optimization (#4673)
9dabf3f Add missing tokenizer/transformer kwargs (#4682)
9ac6c76 Allow overrides to be JSON string or dict (#4680)
55cfb47 The truncation setting doesn't do anything anymore (#4672)
990c9c1 clarify conda Python version in README.md
97db538 official support for Python 3.8 🐍 (#4671)
1e381bb Clean up the documentation for beam search (#4664)
11def8e Update bug_report.md
97fe88d Cached path command (#4652)
c9f376b Update transformers requirement from <3.2,>=3.1 to >=3.1,<3.3 (#4663)
e5e3d02 tick version for nightly releases
b833f90 fix multi-line links in docs (#4660)
d7c06fe Expose from_pretrained keyword arguments (#4651)
175c76b fix confusing distributed logging info (#4654)
fbd2ccc fix numbering in RELEASE_GUIDE
2d5f24b improve how cached_path extracts archives (#4645)
824f97d smooth out release process (#4648)
c7b7c00 Feature/prevent temp directory retention (#4643)
de5d68b Fix tensor.nonzero() function overload warning (#4644)
e8e89d5 add flag for saving predictions to 'evaluate' command (#4637)
e4fd5a0 Multi-label F-beta metric (#4562)
f0e7a78 Create Dependabot config file (#4635)
0e33b0b Return consistent types from metrics (#4632)
2df364f Update transformers requirement from <3.1,>=3.0 to >=3.0,<3.2 (#4621)
6d480aa Im...


v1.1.0

08 Sep 20:32

Highlights

Version 1.1 was mainly focused on bug fixes, but there are a few important new features, such as gradient checkpointing with pretrained transformer embedders and official support for automatic mixed precision (AMP) training through the new torch.cuda.amp module.

Details

Added

  • Predictor.capture_model_internals() now accepts a regex specifying which modules to capture.
  • Added the option to specify requires_grad: false within an optimizer's parameter groups.
  • Added the file-friendly-logging flag back to the train command. Also added this flag to the predict, evaluate, and find-learning-rate commands.
  • Added an EpochCallback to track current epoch as a model class member.
  • Added the option to enable or disable gradient checkpointing for transformer token embedders via boolean parameter gradient_checkpointing.
  • Added a method to ModelTestCase for running basic model tests when you aren't using config files.
  • Added some convenience methods for reading files.
  • cached_path() can now automatically extract and read files inside of archives.
  • Added the ability to pass an archive file instead of a local directory to Vocabulary.from_files.
  • Added the ability to pass an archive file instead of a glob to ShardedDatasetReader.
  • Added a new "linear_with_warmup" learning rate scheduler.
  • Added a check in ShardedDatasetReader that ensures the base reader doesn't implement manual distributed sharding itself.
  • Added an option to PretrainedTransformerEmbedder and PretrainedTransformerMismatchedEmbedder to use a scalar mix of all hidden layers from the transformer model instead of just the last layer. To utilize this, just set last_layer_only to False (see the sketch after this list).
  • Training metrics now include batch_loss and batch_reg_loss in addition to aggregate loss across number of batches.
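
A sketch combining two of the options above, gradient checkpointing and a scalar mix of all hidden layers, on the pretrained transformer embedder; the model name is just an example.

```python
from allennlp.modules.token_embedders import PretrainedTransformerEmbedder

embedder = PretrainedTransformerEmbedder(
    model_name="bert-base-uncased",  # example model
    last_layer_only=False,           # scalar mix of all hidden layers
    gradient_checkpointing=True,     # trade compute for memory during training
)
```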

Changed

  • Upgraded PyTorch requirement to 1.6.
  • Beam search now supports multi-layer decoders.
  • Replaced the NVIDIA Apex AMP module with torch's native AMP module. The default trainer (GradientDescentTrainer) now takes a use_amp: bool parameter instead of the old opt_level: str parameter.
  • Not specifying a cuda_device now automatically determines whether to use a GPU or not.
  • Discovered plugins are logged so you can see what was loaded.
  • allennlp.data.DataLoader is now an abstract registrable class. The default implementation remains the same, but was renamed to allennlp.data.PyTorchDataLoader.
  • BertPooler can now unwrap and re-wrap extra dimensions if necessary.

Removed

  • Removed the opt_level parameter to Model.load and load_archive. In order to use AMP with a loaded model now, just run the model's forward pass within torch's autocast context.
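
With opt_level gone, mixed precision for a loaded model is just torch's native autocast context; here is a minimal sketch with a stand-in module in place of a real loaded model.

```python
import torch

model = torch.nn.Linear(4, 2).cuda()       # stand-in for a loaded AllenNLP model
inputs = torch.randn(3, 4, device="cuda")

# Run the forward pass under automatic mixed precision.
with torch.cuda.amp.autocast():
    outputs = model(inputs)
```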

Fixed

  • Fixed handling of some edge cases when constructing classes with FromParams where the class
    accepts **kwargs.
  • Fixed division by zero error when there are zero-length spans in the input to a
    PretrainedTransformerMismatchedIndexer.
  • Improved robustness of cached_path when extracting archives so that the cache won't be corrupted
    if a failure occurs during extraction.
  • Fixed a bug with the average and evalb_bracketing_score metrics in distributed training.
  • Fixed a bug in distributed metrics that caused nan values due to repeated addition of an accumulated variable.
  • Fixed how truncation was handled with PretrainedTransformerTokenizer.
    Previously, if max_length was set to None, the tokenizer would still do truncation if the
    transformer model had a default max length in its config.
    Also, when max_length was set to a non-None value, several warnings would appear
    for certain transformer models around the use of the truncation parameter.
  • Fixed evaluation of all metrics when using distributed training.
  • Added a py.typed marker. Fixed type annotations in allennlp.training.util.
  • Fixed problem with automatically detecting whether tokenization is necessary.
    This affected primarily the Roberta SST model.
  • Improved help text for using the --overrides command line flag.
  • Removed unnecessary warning about deadlocks in DataLoader.
  • Fixed testing models that only return a loss when they are in training mode.
  • Fixed a bug in FromParams that caused silent failure in case of the parameter type being Optional[Union[...]].
  • Fixed a bug where the program crashes if evaluation_data_loader is an AllennlpLazyDataset.
  • Reduced the amount of log messages produced by allennlp.common.file_utils.
  • Fixed a bug where PretrainedTransformerEmbedder parameters appeared to be trainable
    in the log output even when train_parameters was set to False.
  • Fixed a bug with the sharded dataset reader where it would only read a fraction of the instances
    in distributed training.
  • Fixed checking equality of ArrayFields.
  • Fixed a bug where NamespaceSwappingField did not work correctly with .empty_field().
  • Put more sensible defaults on the huggingface_adamw optimizer.
  • Simplified logging so that all logging output always goes to one file.
  • Fixed interaction with the python command line debugger.
  • Log the grad norm properly even when we're not clipping it.
  • Fixed a bug where PretrainedModelInitializer failed to initialize a model with a 0-dim tensor.
  • Fixed a bug with the layer unfreezing schedule of the SlantedTriangular learning rate scheduler.
  • Fixed a regression with logging in the distributed setting. Only the main worker should write log output to the terminal.
  • Pinned the version of boto3 for package managers (e.g. poetry).
  • Fixed issue #4330 by updating the tokenizers dependency.
  • Fixed a bug in TextClassificationPredictor so that it passes tokenized inputs to the DatasetReader
    in case it does not have a tokenizer.
  • reg_loss is now only returned for models that have some regularization penalty configured.
  • Fixed a bug that prevented cached_path from downloading assets from GitHub releases.
  • Fixed a bug that erroneously increased last label's false positive count in calculating fbeta metrics.
  • Tqdm output now looks much better when the output is being piped or redirected.
  • Small improvements to how the API documentation is rendered.
  • Only show validation progress bar from main process in distributed training.

Commits

dcc9cdc Prepare for release v1.1.0
aa750be fix Average metric (#4624)
e1aa57c improve robustness of cached_path when extracting archives (#4622)
711afaa Fix division by zero when there are zero-length spans in MismatchedEmbedder. (#4615)
be97943 Improve handling of **kwargs in FromParams (#4616)
187b24e add more tutorial links to README (#4613)
e840a58 s/logging/logger/ (#4609)
dbc3c3f Added batched versions of scatter and fill to util.py (#4598)
2c54cf8 reformat for new version of black (#4605)
2dd335e batched_span_select now guarantees element order in each span (#4511)
62f554f specify module names by a regex in predictor.capture_model_internals() (#4585)
f464aa3 Bump markdown-include from 0.5.1 to 0.6.0 (#4586)
d01cdff Update RELEASE_PROCESS.md to include allennlp-models (#4587)
3aedac9 Prepare for release v1.1.0rc4
87a61ad Bug fix in distributed metrics (#4570)
71a9a90 upgrade actions to cache@v2 (#4573)
bd9ee6a Give better usage info for overrides parameter (#4575)
0a456a7 Fix boolean and categorical accuracy for distributed (#4568)
8511274 add actions workflow for closing stale issues (#4561)
de41306 Static type checking fixes (#4545)
5a07009 Fix RoBERTa SST (#4548)
351941f Only pin mkdocs-material to minor version, ignore specific patch version (#4556)
0ac13a4 fix CHANGELOG
3b86f58 Prepare for release v1.1.0rc3
44d2847 Metrics in distributed setting (#4525)
1d61965 Bump mkdocs-material from 5.5.3 to 5.5.5 (#4547)
5b97780 tick version for nightly releases
b32608e add gradient checkpointing for transformer token embedders (#4544)
f639336 Fix logger being created twice (#4538)
660fdaf Fix handling of max length with transformer tokenizers (#4534)
15e288f EpochCallBack for tracking epoch (#4540)
9209bc9 Bump mkdocs-material from 5.5.0 to 5.5.3 (#4533)
bfecdc3 Ensure len(self.evaluation_data_loader) is not called (#4531)
5bc3b73 Fix typo in warning in file_utils (#4527)
e80d768 pin torch >= 1.6
73220d7 Prepare for release v1.1.0rc2
9415350 Update torch requirement from <1.6.0,>=1.5.0 to >=1.5.0,<1.7.0 (#4519)
146bd9e Remove link to self-attention modules. (#4512)
2401282 add back file-friendly-logging flag (#4509)
54e5c83 closes #4494 (#4508)
fa39d49 ensure call methods are rendered in docs (#4522)
e53d185 Bug fix for case when param type is Optional[Union...] (#4510)
14f63b7 Make sure we have a bool tensor where we expect one (#4505)
18a4eb3 add a requires_grad option to param groups (#4502)
6c848df Bump mkdocs-material from 5.4.0 to 5.5.0 (#4507)
d73f8a9 More BART changes (#4500)
1cab3bf Update beam_search.py (#4462)
478bf46 remove deadlock warning in DataLoader (#4487)
714334a Fix reported loss: Bug fix in batch_loss (#4485)
db20b1f use longer tqdm intervals when output being redirected (#4488)
53eeec1 tick version for nightly releases
d693cf1 PathLike (#4479)
2f87832 only show validation progress bar from main process (#4476)
9144918 Fix reported loss (#4477)
5c97083 fix release link in CHANGELOG and formatting in README
4eb9795 Prepare for release v1.1.0rc1
f195440 update 'Models' links in README (#4475)
9c801a3 add CHANGELOG to API docs, point to license on GitHub, improve API doc formatting (#4472)
69d2f03 Clean up Tqdm bars when output is being piped or redirected (#4470)
7b188c9 fixed bug that erronously increased last label's false positive count (#4473)
64db027 Skip ETag check if OSError (#4469)
b9d011e More BART ...


v1.1.0rc4

20 Aug 18:50
Pre-release

Changes since v1.1.0rc3

Added

  • Added a workflow to GitHub Actions that will automatically close unassigned stale issues and
    ping the assignees of assigned stale issues.

Fixed

  • Fixed a bug in distributed metrics that caused nan values due to repeated addition of an accumulated variable.

Commits

87a61ad Bug fix in distributed metrics (#4570)
71a9a90 upgrade actions to cache@v2 (#4573)
bd9ee6a Give better usage info for overrides parameter (#4575)
0a456a7 Fix boolean and categorical accuracy for distributed (#4568)
8511274 add actions workflow for closing stale issues (#4561)
de41306 Static type checking fixes (#4545)
5a07009 Fix RoBERTa SST (#4548)
351941f Only pin mkdocs-material to minor version, ignore specific patch version (#4556)

v1.1.0rc3

12 Aug 20:18
Pre-release

Changes since v1.1.0rc2

Fixed

  • Fixed how truncation was handled with PretrainedTransformerTokenizer.
    Previously, if max_length was set to None, the tokenizer would still do truncation if the
    transformer model had a default max length in its config.
    Also, when max_length was set to a non-None value, several warnings would appear
    for certain transformer models around the use of the truncation parameter.
  • Fixed evaluation of all metrics when using distributed training.

Commits

0ac13a4 fix CHANGELOG
3b86f58 Prepare for release v1.1.0rc3
44d2847 Metrics in distributed setting (#4525)
1d61965 Bump mkdocs-material from 5.5.3 to 5.5.5 (#4547)
5b97780 tick version for nightly releases
b32608e add gradient checkpointing for transformer token embedders (#4544)
f639336 Fix logger being created twice (#4538)
660fdaf Fix handling of max length with transformer tokenizers (#4534)
15e288f EpochCallBack for tracking epoch (#4540)
9209bc9 Bump mkdocs-material from 5.5.0 to 5.5.3 (#4533)
bfecdc3 Ensure len(self.evaluation_data_loader) is not called (#4531)
5bc3b73 Fix typo in warning in file_utils (#4527)
e80d768 pin torch >= 1.6

v1.1.0rc2

31 Jul 17:03
Pre-release

What's new since v1.1.0rc1

Changed

  • Upgraded PyTorch requirement to 1.6.
  • Replaced the NVIDIA Apex AMP module with torch's native AMP module. The default trainer (GradientDescentTrainer)
    now takes a use_amp: bool parameter instead of the old opt_level: str parameter.

Fixed

  • Removed unnecessary warning about deadlocks in DataLoader.
  • Fixed testing models that only return a loss when they are in training mode.
  • Fixed a bug in FromParams that caused silent failure in case of the parameter type being Optional[Union[...]].

Added

  • Added the option to specify requires_grad: false within an optimizer's parameter groups.
  • Added the file-friendly-logging flag back to the train command. Also added this flag to the predict, evaluate, and find-learning-rate commands.

Removed

  • Removed the opt_level parameter to Model.load and load_archive. In order to use AMP with a loaded
    model now, just run the model's forward pass within torch's autocast
    context.

Commits

73220d7 Prepare for release v1.1.0rc2
9415350 Update torch requirement from <1.6.0,>=1.5.0 to >=1.5.0,<1.7.0 (#4519)
146bd9e Remove link to self-attention modules. (#4512)
2401282 add back file-friendly-logging flag (#4509)
54e5c83 closes #4494 (#4508)
fa39d49 ensure call methods are rendered in docs (#4522)
e53d185 Bug fix for case when param type is Optional[Union...] (#4510)
14f63b7 Make sure we have a bool tensor where we expect one (#4505)
18a4eb3 add a requires_grad option to param groups (#4502)
6c848df Bump mkdocs-material from 5.4.0 to 5.5.0 (#4507)
d73f8a9 More BART changes (#4500)
1cab3bf Update beam_search.py (#4462)
478bf46 remove deadlock warning in DataLoader (#4487)
714334a Fix reported loss: Bug fix in batch_loss (#4485)
db20b1f use longer tqdm intervals when output being redirected (#4488)
53eeec1 tick version for nightly releases
d693cf1 PathLike (#4479)
2f87832 only show validation progress bar from main process (#4476)
9144918 Fix reported loss (#4477)
5c97083 fix release link in CHANGELOG and formatting in README