# Instagram -> Pixelfed Mirror in Python (3+)
This Python script, driven through a simple HTTP API, can **mirror multiple accounts from Instagram to your Pixelfed** instance.
## Disclaimer
Mirroring Instagram accounts **will copy (clone) the images and videos from Instagram** to your server, thus moving the data sovereignty to you as the admin.
Always make sure you mirror accounts whose **content is not strictly copyrighted (or branded) and keep the tagging system of the source code** that marks accounts on Pixelfed as Mirrors.
**Users** of your Pixelfed instance have to know that **these accounts are not real accounts**, as the person behind the Instagram account could create a real Pixelfed account in the future.
## Pre-requisites
- You **need an Instagram account** to make this bot work.
- **A Pixelfed instance** you own. It must be hosted or managed by you (you need shell access for **php artisan commands**).
- **Python 3+ installed** on the same machine as the Pixelfed instance (this is easy, as almost all standard Debian-based Linux systems ship python3).
- A bit **more disk space** for Pixelfed media storage. (It depends on how many accounts you mirror and how often those accounts post on IG.)
- **Patch 1 file** in the deployed Pixelfed code (whether it runs on Docker or not).
## What can you mirror?
- Posts with **images** (multiple images support).
- Posts with **videos** (only 1 video).
- Each **post's caption** (post description) is **split and posted as comments**.
## What can't you mirror (yet)?
- **Stories**. (Support is planned.)
# Installation
### Patch some files
As mentioned in the Pre-requisites, you will need to **patch the following file in the running Pixelfed code**.
This patch is **needed to disable Pixelfed API rate limiting completely**, as the limit might be hit when using your mirror bot.
We assume your Pixelfed installation is at `/var/www` (if it isn't, just **change the steps to match your installation**).
`/var/www/vendor/laravel/framework/src/Illuminate/Routing/Middleware/ThrottleRequests.php`
Add `return 99999999;` right after the line that checks whether the user is authenticated.
```php
/**
 * Resolve the number of attempts if the user is authenticated or not.
 *
 * @param  \Illuminate\Http\Request  $request
 * @param  int|string  $maxAttempts
 * @return int
 */
protected function resolveMaxAttempts($request, $maxAttempts)
{
    if (Str::contains($maxAttempts, '|')) {
        $maxAttempts = explode('|', $maxAttempts, 2)[$request->user() ? 1 : 0];
    }

    if (! is_numeric($maxAttempts) && $request->user()) {
        return 99999999; // <-- line added by this patch
        $maxAttempts = $request->user()->{$maxAttempts};
    }

    return (int) $maxAttempts;
}
```
### Setup the environment
1. Clone the project somewhere `git clone https://git.nogafam.es/nogafam/ig-pixelfed-mirror.git`
2. Create needed directories for the environment
```bash
cd ig-pixelfed-mirror/
mkdir cache db headers
# DIRECTORIES EXPLAINED
# ----
# "cache" will keep the cached html/json/jpg/mp4 files from Instagram (it can be safely cleaned, it doesn't need a backup)
# "db" will keep files containing information about your mirrored accounts
# (such as: login credentials, cookies for Pixelfed, IG posts that were already added (to avoid duplicates), etc...)
# "headers" is a directory you need to fill with plain-text TXT files containing some key headers from Instagram Web sessions
# (this is explained in step 5!)
```
3. Copy `config.json.example` to `config.json` and configure it to your needs. Make sure you **remove the comments from the file**, because Python's JSON parser rejects them.
4. Copy `scripts/user_create.example` to `scripts/user_create`. This script runs the `php artisan user:create` command with the given positional parameters to **automatically create local Pixelfed accounts** with forced email verification. **Adapt to your needs**.
5. [Log in](https://www.instagram.com) to your Instagram account in a **browser** (with "remember me" ticked), **open your browser's developer console on the "Network" tab**, navigate through IG a bit and **copy the Request Headers** of any request made to _instagram.com_. Make sure the `X-IG-App-ID`, `X-IG-WWW-Claim` and `Cookie` headers are set.
```bash
# create a new headers file on "headers/" on your working directory
# you can view an example at "headers.example" file
vim headers/1.txt
# FILE EDITING
# ----
# 1. Add the headers you copied from the browser.
# 2. Keep only the "X-IG-App-ID, X-IG-WWW-Claim, Cookie" headers,
# "User-Agent" might be also good to keep (to match the browser)
# (It uses a common User-Agent by default)
# -
# Done editing
```
Create as many headers TXT files as you wish. The mirror bot **uses them randomly on requests to Instagram**.
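
To illustrate the idea of rotating header files, here is a minimal sketch. It assumes each file holds one `Name: value` header per line, as copied from the browser; the real format is the one shown in `headers.example`, and the function names below are hypothetical, not the project's own.

```python
import random
from pathlib import Path

# Headers this README says must be present in every file.
REQUIRED = ("X-IG-App-ID", "X-IG-WWW-Claim", "Cookie")


def parse_headers(text):
    """Parse 'Name: value' lines into a dict, skipping lines without a colon."""
    headers = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        name, value = line.split(":", 1)
        headers[name.strip()] = value.strip()
    return headers


def missing_headers(headers):
    """Return the required header names absent from the dict (case-insensitive)."""
    present = {name.lower() for name in headers}
    return [name for name in REQUIRED if name.lower() not in present]


def pick_headers(directory="headers"):
    """Pick one TXT file at random from the directory and parse it."""
    files = sorted(Path(directory).glob("*.txt"))
    if not files:
        raise FileNotFoundError(f"no .txt files in {directory!r}")
    return parse_headers(random.choice(files).read_text())
```

Running `missing_headers(pick_headers())` after editing a file is a quick way to spot a header that was dropped by accident.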
## Run the server
The server is just a **simple HTTP Python server** that acts as an API to manage your mirrors.
**IMPORTANT: run it from the project root path!**
`python3 server.py` will run the server API at 0.0.0.0:8080.
Optionally, you can set a **binding port** with a positional argument like this: `python3 server.py 8081`
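
Before starting the server, it can help to confirm that `config.json` no longer contains comments. A minimal sketch, assuming only that the file must be parseable by Python's standard `json` module; the helper `check_config` and the `"port"` key in the example are hypothetical and not part of the project.

```python
import json


def check_config(text):
    """Return None if text is valid JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno/colno point at the first character the parser rejected,
        # which is usually where a leftover comment starts.
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# Example usage with inline strings instead of a real file:
print(check_config('{"port": 8080}'))
print(check_config('{\n  // leftover comment\n  "port": 8080\n}'))
```

To check the real file, pass it the result of `open("config.json").read()`.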
## API Documentation
**List** accounts:
`curl 127.0.0.1:8080/list`
**List** accounts in a pretty HTML/CSS interface:
`curl 127.0.0.1:8080/mirrors`
**Add a new Instagram account to mirror**. This runs **synchronously until the account information has been added**; once that is done, it **calls update asynchronously**.
`curl 127.0.0.1:8080/<username>/add`
**Update Instagram account/s mirror** on Pixelfed (mirrors new content). This is **done fully asynchronously**.
```bash
# Update just the given account
curl 127.0.0.1:8080/<username>/update
# The API supports a wildcard to update all mirrored accounts
curl '127.0.0.1:8080/*/update'
```
**Configure parameters** of the Pixelfed account (on local mirror file-based DB)
```bash
# Get the value of a parameter (except: password, cookie)
curl 127.0.0.1:8080/<username>/cfg/var1
# Set the value of a parameter
# these keys are immutable: name, username, password, cookie
curl '127.0.0.1:8080/<username>/cfg/var1?valueyouwant'
# CONFIGURABLE PARAMETERS:
# ------------------
# "sched" key sets a weight schedule so you can make an account update less often.
#
# For example: if you set the value to "3", only 1 out of every 3 update calls will actually update the account
# "sched_now" is an automatic variable that counts the current status (used when "sched" is set).
#
# The account is skipped on wildcard or single /update actions until the count hits its max value ("sched")
# As in the example above (sched = 3), you would set this value like this:
curl '127.0.0.1:8080/<username>/cfg/sched?3'
# ------------------
```
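
The `sched` / `sched_now` behaviour described in the comments above can be pictured as a simple counter. This is a sketch of one plausible reading, not the project's actual code: it assumes `sched_now` is incremented on every update call and that the account is really updated (and the counter reset) once it reaches `sched`. The function name `should_update` is hypothetical.

```python
def should_update(cfg):
    """Decide whether this update call should really run, mutating cfg.

    cfg is a dict standing in for the account's file-based settings.
    """
    sched = int(cfg.get("sched", 1))
    if sched <= 1:
        return True  # no weighting configured: update every time
    count = int(cfg.get("sched_now", 0)) + 1
    if count >= sched:
        cfg["sched_now"] = 0  # reset the counter and update
        return True
    cfg["sched_now"] = count  # skip this round
    return False


cfg = {"sched": 3}
print([should_update(cfg) for _ in range(6)])
# [False, False, True, False, False, True]
```

Under this reading, an account with `sched` set to 3 and a job every 4 hours is refreshed once every 12 hours.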
**Log in or log out** the account from Pixelfed.
```bash
curl 127.0.0.1:8080/<username>/login
curl '127.0.0.1:8080/*/login'
curl 127.0.0.1:8080/<username>/logout
curl '127.0.0.1:8080/*/logout'
```
In case of repeated posts or disaster, **prune/nuke all Pixelfed posts** from the given account.
`curl 127.0.0.1:8080/<username>/nuke`
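
For scripting, the endpoints above can be wrapped in a few lines of Python. A minimal sketch, assuming the server listens on the default `127.0.0.1:8080` and that every action is a plain GET request, as the `curl` examples suggest. The `MirrorClient` class is a hypothetical helper, and whether the server accepts percent-encoded values is also an assumption.

```python
from urllib.parse import quote
from urllib.request import urlopen


class MirrorClient:
    def __init__(self, base="http://127.0.0.1:8080"):
        self.base = base.rstrip("/")

    def url(self, username, action, key=None, value=None):
        """Build the URL for /<username>/<action>[/<key>[?<value>]]."""
        # "*" is the wildcard for all accounts, so keep it unescaped.
        parts = [quote(username, safe="*"), action]
        if key is not None:
            parts.append(quote(key, safe=""))
        url = f"{self.base}/" + "/".join(parts)
        if value is not None:
            url += "?" + quote(str(value), safe="")
        return url

    def call(self, username, action, key=None, value=None, timeout=30):
        """Send the GET request and return the response body as text."""
        with urlopen(self.url(username, action, key, value), timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")


client = MirrorClient()
print(client.url("*", "update"))                  # http://127.0.0.1:8080/*/update
print(client.url("someuser", "cfg", "sched", 3))  # http://127.0.0.1:8080/someuser/cfg/sched?3
```

With the server running, `client.call("*", "update")` would trigger the same wildcard update as the `curl` command.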
## Automatically updating posts of mirrored accounts
You have to **create a cronjob or systemd timer** (or use any other scheduler) to run `curl '127.0.0.1:8080/*/update'`.
Please consider **how many accounts you mirror**, **how often you want to update posts** and **how often you want to bother Instagram** servers. A good reference is **a job every 4 hours for roughly 20 accounts** to be "completely safe".
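
As a rough sizing aid, the load can be estimated from the job interval and each account's `sched` weight. This is back-of-the-envelope arithmetic only: it assumes an account with `sched = N` is really updated on 1 out of every N job runs, and it counts account updates, not individual requests to Instagram (that number is not documented here). The function names are hypothetical.

```python
def effective_interval_hours(job_interval_hours, sched=1):
    """Hours between real updates of one account with the given sched weight."""
    return job_interval_hours * max(1, sched)


def account_updates_per_day(job_interval_hours, scheds):
    """Total number of real account updates per day across all accounts."""
    return sum(24 / effective_interval_hours(job_interval_hours, s) for s in scheds)


# 20 accounts, job every 4 hours, no weighting:
print(account_updates_per_day(4, [1] * 20))             # 120.0
# Same job, but half of the accounts use sched = 3:
print(account_updates_per_day(4, [1] * 10 + [3] * 10))  # 80.0
```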
## Happy Mirroring!
For any problem, bug or request, **you may contact me** at https://contact.nogafam.es