Scripts for easily maintaining your YaCy instance
Bofh d88c317d17 Added script calling examples 2020-11-19 22:31:05 +01:00
x initial commit 2020-11-19 22:00:22 +01:00
LICENSE initial commit 2020-11-19 22:00:22 +01:00
README.md Added script calling examples 2020-11-19 22:31:05 +01:00
blacklist-add.sh initial commit 2020-11-19 22:00:22 +01:00
clear-rejected-urls.sh initial commit 2020-11-19 22:00:22 +01:00
optimize-solr.sh initial commit 2020-11-19 22:00:22 +01:00
purge-by-lang.sh initial commit 2020-11-19 22:00:22 +01:00
purge-by-regex.sh initial commit 2020-11-19 22:00:22 +01:00
terminate-all.sh initial commit 2020-11-19 22:00:22 +01:00

YaCy Maintenance Scripts

Scripts for easily maintaining your YaCy instance

Requirements

  • YaCy running in Docker
  • The Docker container is named "yacynet"
  • The default Solr collection is named "collection1"

These requirements let the scripts work right out of the box. You can modify the scripts to suit your setup (for example, if you don't use Docker or use different parameters).

Installation

Clone the repo somewhere on your server.

Usage

Mount this project's directory into your Docker container at /scripts.

You can also mount it somewhere else and change the wrapper scripts to match.
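
As a rough sketch of the mount described above, the container could be started along these lines. Only the container name `yacynet` and the `/scripts` mount point come from this README; the image name `yacy/yacy_search_server`, the published port 8090, and the `REPO_DIR` and `DOCKER` variables are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: start YaCy with this repository bind-mounted at /scripts.
# Assumptions (not from this README): image name, published port, REPO_DIR.
REPO_DIR="${REPO_DIR:-$PWD}"   # directory where this repo was cloned
DOCKER="${DOCKER:-docker}"     # set DOCKER=echo to print the command instead

start_yacy() {
    "$DOCKER" run -d \
        --name yacynet \
        -p 8090:8090 \
        -v "$REPO_DIR":/scripts \
        yacy/yacy_search_server:latest
}
```

Calling `start_yacy` from the cloned directory would create the container. A bind mount cannot be added to a container that already exists, so an existing `yacynet` container would have to be recreated with the extra `-v` option.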

Run the wrapper script in x/ that matches the action you want to perform.
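
To illustrate what such a wrapper might do, here is a minimal sketch that forwards a call to a script inside the container. It is hypothetical: the actual wrappers in x/ may be written differently. The function name `run_in_container` and the `CONTAINER`, `SCRIPTS_DIR` and `DOCKER` variables are made up for this example, and it assumes the `.sh` files are executable inside the container.

```shell
#!/bin/sh
# Hypothetical wrapper logic; the real files in x/ may differ.
CONTAINER="${CONTAINER:-yacynet}"        # container name from the requirements
SCRIPTS_DIR="${SCRIPTS_DIR:-/scripts}"   # mount point inside the container
DOCKER="${DOCKER:-docker}"               # set DOCKER=echo for a dry run

run_in_container() {
    # $1 = script file name, remaining arguments are passed through unchanged
    script="$1"
    shift
    "$DOCKER" exec "$CONTAINER" "$SCRIPTS_DIR/$script" "$@"
}
```

For example, `run_in_container purge-by-lang.sh fr` would execute `/scripts/purge-by-lang.sh fr` inside the `yacynet` container. If you mount the project at a different path or use another container name, these two defaults are the only values that need to change.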

These scripts can also be called directly from a cron job.
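
As one possible way to schedule a wrapper, the sketch below builds a crontab line. The clone location `/opt/yacy-scripts`, the `cron_line` helper and the 03:00 schedule are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: build a crontab line that runs a wrapper on a schedule.
# Assumption: /opt/yacy-scripts is a hypothetical clone location.
REPO_DIR="${REPO_DIR:-/opt/yacy-scripts}"

cron_line() {
    # $1 = five-field cron schedule; remaining args = wrapper name and its arguments
    schedule="$1"
    shift
    printf '%s cd %s && ./x/%s\n' "$schedule" "$REPO_DIR" "$*"
}

# Print an entry that clears rejected URLs every day at 03:00.
cron_line '0 3 * * *' clear-rejected-urls
```

The printed line can be pasted into `crontab -e`, or appended non-interactively with `(crontab -l 2>/dev/null; cron_line '0 3 * * *' clear-rejected-urls) | crontab -`.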

Usage Examples!

./x/blist-add-alldomain google.com          # block entire google.com domain
./x/blist-add-domainonly www.google.es      # block only the given domain/subdomain
./x/clear-rejected-urls                     # clear all rejected URLs
./x/purge-by-lang fr                        # purge all URLs by the given language code
./x/purge-by-lang de
./x/purge-by-lang ru
./x/purge-by-regex amazon.com               # purge all URLs that contain the given string in the DOMAIN part of the URL
./x/terminate-all                           # terminate all crawling tasks right now! (new crawling jobs will start if Auto-Crawl is ON)
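
When several languages need purging, as in the examples above, the repeated calls could be folded into a loop. This is only a convenience sketch: the `purge_langs` helper and the `PURGE` variable are not part of the repository.

```shell
#!/bin/sh
# Sketch: purge several languages by looping over the wrapper.
# PURGE points at the wrapper used in the examples; override it for a dry run.
PURGE="${PURGE:-./x/purge-by-lang}"

purge_langs() {
    for lang in "$@"; do
        # Report a failure but keep going with the remaining languages.
        "$PURGE" "$lang" || echo "purge failed for: $lang" >&2
    done
}
```

Running `purge_langs fr de ru` from the project directory would then be equivalent to the three separate `purge-by-lang` calls.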