Scripts for easily maintaining your YaCy instance
Bofh b38973f949 Fixed purge-by-lang script 2020-12-22 00:00:20 +01:00
x Croned job not working because shell is not a tty 2020-11-19 22:38:50 +01:00
LICENSE initial commit 2020-11-19 22:00:22 +01:00
README.md Add all langs to the example on README 2020-11-19 22:42:43 +01:00
blacklist-add.sh initial commit 2020-11-19 22:00:22 +01:00
clear-rejected-urls.sh initial commit 2020-11-19 22:00:22 +01:00
optimize-solr.sh initial commit 2020-11-19 22:00:22 +01:00
purge-by-lang.sh Fixed purge-by-lang script 2020-12-22 00:00:20 +01:00
purge-by-regex.sh initial commit 2020-11-19 22:00:22 +01:00
terminate-all.sh initial commit 2020-11-19 22:00:22 +01:00

README.md

YaCy Maintenance Scripts

Scripts for easily maintaining your YaCy instance

Requirements

  • YaCy running on Docker
  • A Docker container named "yacynet"
  • A default Solr collection named "collection1"

These requirements are the defaults that let the scripts work right out of the box. You can modify the scripts to suit your setup (for example, if you don't use Docker or use different parameters).
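Before running anything, it can help to confirm that the container requirement is met. The following is a minimal sketch, not part of this repository: the helper name `container_running` and the `CONTAINER` variable are hypothetical, and it only checks the container, not the Solr collection.

```shell
#!/bin/sh
# Container name expected by the scripts; override it if yours differs.
CONTAINER="${CONTAINER:-yacynet}"

# Hypothetical helper: succeeds only if Docker reports the container as running.
# Any error (missing container, missing docker binary) counts as "not running".
container_running() {
  [ "$(docker inspect -f '{{.State.Running}}' "$1" 2>/dev/null)" = "true" ]
}

if container_running "$CONTAINER"; then
  echo "OK: container '$CONTAINER' is running"
else
  echo "Container '$CONTAINER' is not running (or docker is unavailable)"
fi
```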

Installation

Clone the repository somewhere on your server.

Usage

You must mount this project's directory into your Docker container at /scripts.

You can also locate it wherever you want and change the wrapper scripts to suit your needs.
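As an illustration of the mount, here is a hedged sketch that prints a `docker run` command with the project directory bound to /scripts. The image name `yacy/yacy_search_server:latest`, the published port 8090, and the helper `print_run_cmd` are assumptions for the example, not something this repository prescribes; adapt them to the way you already start YaCy.

```shell
#!/bin/sh
# Directory where this repository was cloned (assumed: the current directory).
SCRIPTS_DIR="${SCRIPTS_DIR:-$PWD}"
# Assumed image name; replace it with the image you actually run.
IMAGE="${IMAGE:-yacy/yacy_search_server:latest}"

# Hypothetical helper: print (not execute) a docker run command that names
# the container "yacynet" and bind-mounts the scripts directory at /scripts.
# Note: paths containing spaces would need extra quoting in the printed command.
print_run_cmd() {
  printf 'docker run -d --name yacynet -p 8090:8090 -v %s:/scripts %s\n' "$1" "$2"
}

print_run_cmd "$SCRIPTS_DIR" "$IMAGE"
```

Review the printed command and run it by hand once it matches your setup. If the container already exists, it has to be recreated with the extra `-v` option, since bind mounts cannot be added to a running container.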

 

Run the wrapper script in x/ that matches the action you want to perform.

These scripts can also be called directly from a cron job.
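A cron setup could look roughly like the sketch below, which prints crontab lines for two of the wrappers. The install path `/opt/yacy-scripts`, the log file, the schedules, and the helper `cron_line` are all hypothetical. Cron jobs run without a terminal, so any `docker exec` call involved must not request a TTY (no `-t` flag).

```shell
#!/bin/sh
# Hypothetical install location of this repository; adjust to your server.
REPO_DIR="${REPO_DIR:-/opt/yacy-scripts}"

# Hypothetical helper: print one crontab line that runs a wrapper on the
# given schedule and appends its output to a log file (assumed path).
cron_line() {
  printf '%s %s/x/%s >> /var/log/yacy-maintenance.log 2>&1\n' "$1" "$REPO_DIR" "$2"
}

cron_line '0 3 * * *' 'clear-rejected-urls'   # every day at 03:00
cron_line '30 3 * * 0' 'purge-by-lang fr'     # Sundays at 03:30
```

The printed lines can be pasted into `crontab -e` for a user that is allowed to talk to the Docker daemon.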

Usage Examples!

./x/blist-add-alldomain google.com          # block entire google.com domain
./x/blist-add-domainonly www.google.es      # block only the given domain/subdomain
./x/clear-rejected-urls                     # clear all rejected URLs

########## ALL lang codes: ar bg ca cz da de el en es eu fa fi fr ga gl hi hu hy id it ja lv nl no pt ro ru sv th tr
./x/purge-by-lang fr                        # purge all URLs with the given language code (example: French)

./x/purge-by-regex amazon.com               # purge all URLs that contain the given string in the DOMAIN part of the URL
./x/terminate-all                           # terminate all crawling tasks right now! (starts different crawling jobs if Auto-Crawl is ON)
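
The language list above lends itself to a loop when you want to keep only a few languages in the index. This is a minimal sketch under assumptions: the keep list `en es` and the helper `langs_to_purge` are hypothetical, and the loop only prints the commands (a dry run) rather than executing them.

```shell
#!/bin/sh
# Language codes as listed in the usage examples.
ALL_LANGS="ar bg ca cz da de el en es eu fa fi fr ga gl hi hu hy id it ja lv nl no pt ro ru sv th tr"
# Hypothetical choice of languages to keep in the index.
KEEP_LANGS="${KEEP_LANGS:-en es}"

# Hypothetical helper: print every known code that is NOT in the
# space-separated keep list given as $1, one code per line.
langs_to_purge() {
  for lang in $ALL_LANGS; do
    case " $1 " in
      *" $lang "*) ;;          # in the keep list: skip it
      *) echo "$lang" ;;
    esac
  done
}

# Dry run: print the purge commands instead of running them.
for lang in $(langs_to_purge "$KEEP_LANGS"); do
  echo "./x/purge-by-lang $lang"
done
```

Once the printed list looks right, replace the `echo` line with the actual call to `./x/purge-by-lang "$lang"`.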