The error
ERROR: epoll_ctl() failed: Invalid argument
may be emitted by a pglogical apply worker after it loses its connection to the walsender on the other end.
For example:
ERROR: epoll_ctl() failed: Invalid argument
LOG: worker process: bdr apply cluster-node-1 to cluster-node-2 (PID 1016) exited with exit code 1
pglogical 2.2.0 replaced this error with a more informative message. From that version onward, the apply worker emits:
ERROR: connection to other side has died
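Because the same condition is reported with different text depending on the pglogical version, a log search needs to match both messages. The following is a minimal sketch of such a search, not a supported tool. The function name `find_disconnect_errors` and the constant `DISCONNECT_PATTERNS` are hypothetical, and it assumes the message text appears in the log exactly as quoted in this article, optionally preceded by a log line prefix.

```python
import re

# Message texts as quoted in this article. The first is emitted by
# versions before pglogical 2.2.0, the second by 2.2.0 and later.
DISCONNECT_PATTERNS = [
    re.compile(r"ERROR:\s+epoll_ctl\(\) failed: Invalid argument"),
    re.compile(r"ERROR:\s+connection to other side has died"),
]


def find_disconnect_errors(lines):
    """Return (line_number, text) for each line matching either message.

    `lines` is any iterable of strings, for example an open log file.
    Line numbers are 1-based.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        # search() rather than match() so that a log_line_prefix
        # (timestamp, PID, etc.) before "ERROR:" does not prevent a hit.
        if any(pattern.search(line) for pattern in DISCONNECT_PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it, pass an open PostgreSQL log file to `find_disconnect_errors` and compare the matching lines with the times of known network events.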
Either message indicates that a network disconnection interrupted the link between the walsender and the apply worker.
A number of causes are possible, each of which must be investigated separately by process of elimination:
- Network issues: intermittent connections, packet loss, routing problems, NAT connection tracking timeouts, MTU issues, etc.
- Firewall and intrusion detection system activity
- A known PostgreSQL bug that causes unnecessary walsender timeouts during logical decoding. See related articles linked to this solution.