
YaCy Maintenance Scripts

Scripts for easily maintaining your YaCy instance

Requirements

  • YaCy running in Docker
  • The Docker container is named "yacynet"
  • The default Solr collection is called "collection1"

These requirements are what the scripts need to work right out of the box. You can modify the scripts to suit your setup (for example, if you don't use Docker or use different parameters).

Installation

Clone the repository somewhere on your server.

Usage

You must mount this project's directory into your Docker container at /scripts.

You can also place it elsewhere and change the wrapper scripts accordingly.
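
As a rough illustration of the mount, the sketch below prints a `docker run` command that binds the project directory to /scripts. The host path `/opt/yacy-scripts`, the image name `yacy/yacy_search_server:latest` and the published port are assumptions, not taken from this README; only the container name and the /scripts target come from the text above. A real deployment would normally also mount a data volume, which is left out here.

```shell
#!/bin/sh
# Assumptions (not from this README): host path, image name and port.
SCRIPTS_DIR="${SCRIPTS_DIR:-/opt/yacy-scripts}"
IMAGE="${IMAGE:-yacy/yacy_search_server:latest}"

# Print the docker run command so it can be reviewed before it is executed.
# The container name "yacynet" and the /scripts mount point are the values
# this README expects.
yacy_run_cmd() {
    echo docker run -d --name yacynet -p 8090:8090 \
        -v "${SCRIPTS_DIR}:/scripts" "$IMAGE"
}
```

Calling `yacy_run_cmd` prints the command; copy it into your shell once the paths look right. A bind mount cannot be added to a running container, so an existing "yacynet" container has to be recreated with the extra `-v` option.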

 

Run the wrapper script in x/ that corresponds to the action you want to perform.
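
The wrappers themselves are not reproduced in this README. As a minimal sketch of what such a wrapper might do, assuming it simply runs a script from /scripts inside the "yacynet" container, the function below is hypothetical (`run_in_container` and the `DRY_RUN` switch are illustrative names, and the `.sh` file names are assumed):

```shell
#!/bin/sh
# Container name from the Requirements section; override it if yours differs.
CONTAINER="${CONTAINER:-yacynet}"

# Run /scripts/<script> inside the container, passing extra arguments through.
# No -t flag is used, so the call also works where no terminal is attached.
# With DRY_RUN=1 the command is printed instead of executed.
run_in_container() {
    script="$1"
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo docker exec "$CONTAINER" "/scripts/${script}" "$@"
    else
        docker exec "$CONTAINER" "/scripts/${script}" "$@"
    fi
}
```

For example, `DRY_RUN=1 run_in_container purge-by-lang.sh fr` prints the `docker exec` call without touching the container.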

These scripts can also be called directly from a cron job.
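
A possible way to schedule the wrappers is sketched below. The install path `/opt/yacy-scripts` and the schedules are assumptions chosen for illustration; the helper name `print_cron_entries` is hypothetical. Keep in mind that cron runs jobs without a terminal, so any `docker exec` call made by a wrapper must not request a TTY.

```shell
#!/bin/sh
# Hypothetical location of this repository on the Docker host.
REPO_DIR="${REPO_DIR:-/opt/yacy-scripts}"

# Print example crontab entries for two of the wrappers.
# The schedules are illustrative, not recommendations from the project.
print_cron_entries() {
    cat <<EOF
# Clear rejected URLs every night at 03:00
0 3 * * * ${REPO_DIR}/x/clear-rejected-urls
# Purge French-language URLs every Sunday at 04:00
0 4 * * 0 ${REPO_DIR}/x/purge-by-lang fr
EOF
}
```

Run `print_cron_entries`, review the output, and paste the lines into your crontab with `crontab -e`.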

Usage Examples

./x/blist-add-alldomain google.com          # block the entire google.com domain
./x/blist-add-domainonly www.google.es      # block only the given domain/subdomain
./x/clear-rejected-urls                     # clear all rejected URLs

# All language codes: ar bg ca cz da de el en es eu fa fi fr ga gl hi hu hy id it ja lv nl no pt ro ru sv th tr
./x/purge-by-lang fr                        # purge all URLs in the given language (example: French)

./x/purge-by-regex amazon.com               # purge all URLs that contain the given string in the domain part of the URL
./x/terminate-all                           # terminate all crawling tasks immediately (other crawl jobs are started if Auto-Crawl is ON)
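
Since purge-by-lang is shown with a single language code, purging everything except a few languages means calling it once per code. The sketch below is one way to build that list; the `KEEP` variable and the `langs_to_purge` helper are hypothetical names, and the language list is copied from the examples above.

```shell
#!/bin/sh
# Language codes as listed in the usage examples.
ALL_LANGS="ar bg ca cz da de el en es eu fa fi fr ga gl hi hu hy id it ja lv nl no pt ro ru sv th tr"
# Languages to keep in the index (illustrative choice).
KEEP="${KEEP:-en es}"

# Print every language code that is not in KEEP, one per line.
langs_to_purge() {
    for lang in $ALL_LANGS; do
        case " $KEEP " in
            *" $lang "*) ;;          # kept language: skip it
            *) echo "$lang" ;;
        esac
    done
}

# Usage (run from the repository root), left commented out on purpose:
# for lang in $(langs_to_purge); do ./x/purge-by-lang "$lang"; done
```

Check the printed list before uncommenting the loop.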