added crawler config option to arguments
According to https://github.com/webrecorder/browsertrix-crawler#yaml-crawl-config, the crawler can be configured with a YAML config file,
which gives more options to configure the crawler to your needs without implementing all the options in zimit.py.
f0sh authored and rgaudin committed Jul 17, 2023
1 parent 57e2f41 commit 95c27ba
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions zimit.py
@@ -317,6 +317,12 @@ def zimit(args=None):
        help="If set, output stats as JSON to this file",
    )

    parser.add_argument(
        "--config",
        help="Path to YAML config file. If set, browsertrix-crawler will use this file "
        "to configure the crawling behaviour if not set via argument.",
    )

    zimit_args, warc2zim_args = parser.parse_known_args(args)

    # pass url and output to warc2zim also
@@ -488,6 +494,7 @@ def get_node_cmd_line(args):
        "timeLimit",
        "healthCheckPort",
        "overwrite",
        "config",
    ]:
        value = getattr(args, arg)
        if value:
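
The two hunks above work together: the first registers `--config` on the argument parser, and the second adds `config` to the list of options forwarded to the crawler. The following is a minimal sketch of that flow, not the actual zimit.py code. The trimmed-down parser, the `build_parser` helper, the `["crawl"]` base command, and the way each set option is appended as `--<name> <value>` are assumptions made for illustration; only `--config`, `parse_known_args`, `get_node_cmd_line`, and the `getattr`/`if value:` pattern come from the diff.

```python
import argparse


def build_parser():
    # Hypothetical, trimmed-down parser: the real one defines many more options.
    parser = argparse.ArgumentParser(prog="zimit-sketch")
    parser.add_argument("--url", help="URL to start crawling from")
    parser.add_argument(
        "--config",
        help="Path to YAML config file passed on to browsertrix-crawler.",
    )
    return parser


def get_node_cmd_line(args):
    # Assumption: every option that is set is forwarded as "--<name> <value>".
    node_cmd = ["crawl"]
    for arg in ["url", "config"]:
        value = getattr(args, arg)
        if value:
            node_cmd.append("--" + arg)
            # Boolean flags carry no value; everything else is stringified.
            if not isinstance(value, bool):
                node_cmd.append(str(value))
    return node_cmd


if __name__ == "__main__":
    # Unknown options (here "--name demo") are left for another tool to consume.
    known, passthrough = build_parser().parse_known_args(
        ["--url", "https://example.com/", "--config", "crawl.yaml", "--name", "demo"]
    )
    print(get_node_cmd_line(known))
    print(passthrough)
```

Because the loop only forwards options that are set, omitting `--config` leaves the crawler command line unchanged, which matches the "0 deletions" nature of the commit.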