Node Troubleshooting

Node Won't Start

Symptom: node.sh ela start runs but the process immediately exits.

Checks:

# Check if another instance is already running
pgrep -x ela

# Check if ports are in use
lsof -i :20336
lsof -i :20338

# Check the error output
cat ~/node/ela/output

# Check system logs
journalctl --user -n 50

# Verify binary is executable
file ~/node/ela/ela
chmod a+x ~/node/ela/ela

# Check file descriptor limits
ulimit -n
# Should be 40960 or higher
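
The file descriptor check above can be turned into a pass/fail test. This is a minimal sketch; `fd_limit_ok` is a hypothetical helper name, not part of node.sh, and the 40960 threshold is taken from the note above.

```shell
#!/bin/bash
# Sketch: check a file descriptor limit against the 40960 minimum noted above.
# fd_limit_ok is a hypothetical helper, not part of node.sh.
fd_limit_ok() {
    local limit="$1" min="${2:-40960}"
    # ulimit can report "unlimited", which always satisfies the minimum
    [ "$limit" = "unlimited" ] && return 0
    [ "$limit" -ge "$min" ]
}

current="$(ulimit -n)"
if fd_limit_ok "$current"; then
    echo "fd limit OK: $current"
else
    echo "fd limit too low: $current (want 40960 or higher)"
fi
```

If the limit is too low, `ulimit -n 40960` raises it for the current shell, provided the hard limit allows it.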

Common causes:

Port already in use by another process → kill the conflicting process
Corrupted chain data → remove ~/node/ela/elastos/data/ and re-sync
Insufficient permissions → check file ownership
Missing dependencies → run sudo apt-get install -y jq lsof apache2-utils

Sync Stalled

Symptom: Block height stops increasing.

# Check current height
node.sh ela jsonrpc getcurrentheight

# Check peer count
node.sh ela jsonrpc getconnectioncount

# If peers = 0, check firewall
sudo ufw status
# Ensure port 20338 is open

# Check if the process is alive but stuck
node.sh ela status
# Look at uptime and RAM usage

# Check for disk space
df -h ~/node

# Force reconnection by restarting
node.sh ela stop
sleep 5
node.sh ela start
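
To confirm a stall before restarting, sample the height twice and compare. The sketch below assumes `node.sh ela jsonrpc getcurrentheight` prints the height as a bare integer; if your version prints JSON, extract the number first. `height_stalled` and `watch_height` are hypothetical helper names.

```shell
#!/bin/bash
# Sketch: decide whether the chain height has stalled between two samples.
# height_stalled and watch_height are hypothetical helpers, not part of node.sh.
height_stalled() {
    local before="$1" after="$2"
    # Stalled means the height did not increase between samples
    [ "$after" -le "$before" ]
}

watch_height() {
    # Pick an interval well above the expected block time
    local interval="${1:-600}" before after
    # Assumption: this command prints the height as a bare integer
    before="$(node.sh ela jsonrpc getcurrentheight)"
    sleep "$interval"
    after="$(node.sh ela jsonrpc getcurrentheight)"
    if height_stalled "$before" "$after"; then
        echo "STALLED at height $after after ${interval}s"
        return 1
    fi
    echo "OK: height advanced from $before to $after"
}
```

Calling `watch_height 600` waits ten minutes between samples and returns non-zero if the height did not move.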

ESC/EID sync stall:

# Check sync status (eth_syncing returns false once fully synced)
curl -s http://127.0.0.1:20636 \
    -H 'Content-Type: application/json' \
    -d '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}' | jq

# Check peer count
curl -s http://127.0.0.1:20636 \
    -H 'Content-Type: application/json' \
    -d '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}' | jq

# Restart with clean peer cache
node.sh esc stop
rm -rf ~/node/esc/data/geth/nodes/
node.sh esc start
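
Ethereum-style JSON-RPC methods such as `net_peerCount` and `eth_blockNumber` return quantities as hex strings (for example `"0x1a"`), which are hard to read at a glance. The following sketch decodes them; `hex_to_dec` and `esc_rpc` are hypothetical helper names, and the endpoint is the same 127.0.0.1:20636 used above.

```shell
#!/bin/bash
# Sketch: decode the hex quantities that Ethereum-style JSON-RPC methods return.
# hex_to_dec and esc_rpc are hypothetical helpers, not part of node.sh.
hex_to_dec() {
    # printf accepts 0x-prefixed integers for %d
    printf '%d\n' "$1"
}

esc_rpc() {
    local method="$1"
    # Sends a parameterless request and prints only the "result" field
    curl -s http://127.0.0.1:20636 \
        -H 'Content-Type: application/json' \
        -d "{\"jsonrpc\":\"2.0\",\"method\":\"$method\",\"params\":[],\"id\":1}" |
        jq -r '.result'
}

# Usage (needs a running ESC node and jq):
#   hex_to_dec "$(esc_rpc net_peerCount)"
#   hex_to_dec "$(esc_rpc eth_blockNumber)"
```

Note that `eth_syncing` returns either `false` or an object, so it should be inspected with `jq` directly rather than passed to `hex_to_dec`.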

High Resource Usage

High CPU:

# Identify which process is consuming CPU
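# pgrep -f can match several PIDs, so collect them all and join with commas
top -p "$( { pgrep -x ela; pgrep -f esc; pgrep -f eid; } | sort -un | paste -sd, -)"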

# ELA high CPU during sync is normal
# After sync, CPU should drop to near-zero between blocks

High RAM:

# Check per-process memory
node.sh ela status  # Shows RAM usage

# ESC/EID geth forks can consume significant RAM
# Reduce cache if needed:
# Edit the start command to add --cache 512 (default is 1024)
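
As a cross-check on the figure that `node.sh ela status` reports, resident memory can be read directly from `ps`. This is a minimal sketch; `rss_kib` and `kib_to_mib` are hypothetical helper names.

```shell
#!/bin/bash
# Sketch: read resident memory (RSS) for a process directly from ps.
# rss_kib and kib_to_mib are hypothetical helpers, not part of node.sh.
rss_kib() {
    # ps reports RSS in KiB; "rss=" suppresses the header line
    ps -o rss= -p "$1" | tr -d ' '
}

kib_to_mib() {
    # Integer division, rounds down
    echo $(( $1 / 1024 ))
}

# Report memory for every process named exactly "ela" (prints nothing if none run)
for pid in $(pgrep -x ela 2>/dev/null); do
    echo "ela pid $pid: $(kib_to_mib "$(rss_kib "$pid")") MiB"
done
```

To inspect ESC or EID, replace the `pgrep` pattern with one that matches that process on your system.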

High disk I/O:

# Check which process is writing
# iotop usually needs root
sudo iotop -p "$(pgrep -x ela)"

# During initial sync, high I/O is expected
# After sync, I/O should be minimal (only new blocks)

RPC Errors

"Connection refused" on RPC port:

# Verify the process is running
pgrep -x ela

# Verify RPC is enabled in config
jq '.Configuration.EnableRPC' ~/node/ela/config.json

# Check what address/port RPC is binding to
lsof -i :20336 -P

# Check RPC credentials
jq '.Configuration.RpcConfiguration' ~/node/ela/config.json
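
A quick TCP probe shows whether anything is accepting connections on the RPC port at all, which separates a "connection refused" problem from a credential or whitelist problem. The sketch below uses bash's built-in `/dev/tcp` redirection; `port_open` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: test whether a TCP port accepts connections, using bash's /dev/tcp.
# port_open is a hypothetical helper, not part of node.sh.
port_open() {
    local host="$1" port="$2"
    # Exit status is 0 only if the connection succeeds within 2 seconds
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

if port_open 127.0.0.1 20336; then
    echo "RPC port 20336 is accepting connections"
else
    echo "RPC port 20336 is closed or filtered"
fi
```

If the port is open locally but unreachable from another machine, check the bind address and the firewall rather than the node itself.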

"Unauthorized" on RPC calls:

# Your RPC credentials are in config.json
# (avoid naming the variable USER: that would overwrite the shell's own $USER)
RPC_USER=$(jq -r '.Configuration.RpcConfiguration.User' ~/node/ela/config.json)
RPC_PASS=$(jq -r '.Configuration.RpcConfiguration.Pass' ~/node/ela/config.json)

# Use ela-cli with credentials
~/node/ela/ela-cli --rpcuser "$RPC_USER" --rpcpassword "$RPC_PASS" --rpcport 20336 info getcurrentheight

"IP not in whitelist":

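# Add your IP to the whitelist
# (the temp file sits next to the config rather than in /tmp,
# because config.json contains RPC credentials)
jq '.Configuration.RpcConfiguration.WhiteIPList += ["YOUR_IP"]' \
    ~/node/ela/config.json > ~/node/ela/config.json.tmp && \
    mv ~/node/ela/config.json.tmp ~/node/ela/config.json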

# Restart ELA
node.sh ela stop && node.sh ela start

BPoS Issues

"InActive" status after registration:

# Check if the node is fully synced first
node.sh ela jsonrpc getcurrentheight
# Compare against https://blockchain.elastos.io

# Check registration status
node.sh ela status
# Look at BPoS section

# Try reactivation
node.sh ela activate_bpos

Missing blocks / penalties:

# Check node uptime
node.sh ela status  # Look at Uptime

# Verify BPoS port is accessible from outside
# From another server:
nc -zv YOUR_SERVER_IP 20339

# Check clock synchronization
timedatectl status
# Install chrony if the clock is not synchronized:
sudo apt-get install -y chrony
sudo systemctl enable --now chrony

Cross-Chain Failures

ESC deposits not arriving:

# Check arbiter status
node.sh arbiter status

# Check oracle status
node.sh esc-oracle status

# Verify all intermediate components are running
node.sh status

# Check arbiter logs for errors
tail -100 ~/node/arbiter/logs/*.log | grep -i error

Log Analysis

Log locations:

Component	Log Directory	Log Pattern
ELA (node)	`~/node/ela/elastos/logs/node/`	Timestamped entries
ELA (BPoS)	`~/node/ela/elastos/logs/dpos/`	Consensus-specific logs
ESC	`~/node/esc/logs/`	`esc-YYYY-MM-DD-HH_MM_SS.log`
EID	`~/node/eid/logs/`	`eid-YYYY-MM-DD-HH_MM_SS.log`
PG	`~/node/pg/logs/`	`pg-YYYY-MM-DD-HH_MM_SS.log`
ESC-Oracle	`~/node/esc-oracle/logs/`	`esc-oracle_out-*.log`, `esc-oracle_err.log`
EID-Oracle	`~/node/eid-oracle/logs/`	`eid-oracle_out-*.log`, `eid-oracle_err.log`
Arbiter	`~/node/arbiter/logs/`	Timestamped entries

Common log grep patterns:

# Find errors in ELA logs
grep -i "error\|panic\|fatal" ~/node/ela/elastos/logs/node/*.log | tail -20

# Find BPoS consensus issues
grep -i "timeout\|failed\|reject" ~/node/ela/elastos/logs/dpos/*.log | tail -20

# Find ESC sync issues
grep -i "error\|peer\|disconnect" ~/node/esc/logs/esc-*.log | tail -20

# Find cross-chain issues in arbiter
grep -i "error\|failed\|timeout" ~/node/arbiter/logs/*.log | tail -20

# Find oracle errors
cat ~/node/esc-oracle/logs/esc-oracle_err.log
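
When there are many log files, it helps to see which ones contain the most matches before reading any of them. The sketch below counts matching lines per file; `count_matches` is a hypothetical helper name, and the paths are the ones listed in the table above.

```shell
#!/bin/bash
# Sketch: summarize how many lines in each log file match a pattern.
# count_matches is a hypothetical helper, not part of node.sh.
count_matches() {
    local pattern="$1" file
    shift
    for file in "$@"; do
        # Skip unexpanded globs and unreadable files
        [ -r "$file" ] || continue
        # grep -c prints the number of matching lines (0 if none)
        printf '%s\t%s\n' "$(grep -ciE "$pattern" "$file")" "$file"
    done
}

# Usage: rank ELA node logs by error count, highest first
count_matches 'error|panic|fatal' ~/node/ela/elastos/logs/node/*.log | sort -rn | head -5
```

The same function works for the other components, for example `count_matches 'timeout|failed|reject' ~/node/ela/elastos/logs/dpos/*.log`.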

Diagnostic Commands

Full diagnostic dump:

#!/bin/bash
echo "=== System Info ==="
uname -a
free -h
df -h ~/node
uptime

echo "=== Process Status ==="
node.sh status

echo "=== ELA Chain State ==="
node.sh ela jsonrpc getcurrentheight
node.sh ela jsonrpc getconnectioncount
node.sh ela jsonrpc getbestblockhash

echo "=== ESC Chain State ==="
curl -s http://127.0.0.1:20636 \
    -H 'Content-Type: application/json' \
    -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'
curl -s http://127.0.0.1:20636 \
    -H 'Content-Type: application/json' \
    -d '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}'

echo "=== Port Bindings ==="
# Match both the 203xx (ELA) and 206xx (ESC) ports used in this guide
lsof -i -P -n | grep -E '20[36][0-9]{2}'

echo "=== Recent Errors ==="
grep -i error ~/node/ela/elastos/logs/node/*.log 2>/dev/null | tail -5
grep -i error ~/node/esc/logs/esc-*.log 2>/dev/null | tail -5

echo "=== Disk Usage ==="
du -sh ~/node/*/
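
When asking for help, it is usually easier to share a file than terminal scrollback. The sketch below captures the output of any command in a timestamped file; `save_report` and the `diag.sh` filename are hypothetical.

```shell
#!/bin/bash
# Sketch: run a diagnostic command and keep its output in a timestamped file.
# save_report and the diag.sh name are hypothetical.
save_report() {
    local outdir="$1"
    shift
    local outfile
    outfile="$outdir/diag-$(date +%Y%m%d-%H%M%S).txt"
    # Capture stdout and stderr so failed checks are recorded too
    "$@" > "$outfile" 2>&1
    # Print the path so the caller knows where the report went
    echo "$outfile"
}

# Usage, assuming the script above was saved as ./diag.sh:
#   save_report /tmp bash ./diag.sh
```

Review the file before sharing it, since log excerpts can include IP addresses.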
