Node Troubleshooting
Node Won't Start
Symptom: node.sh ela start runs but the process immediately exits.
Checks:
# Check if another instance is already running
pgrep -x ela
# Check if ports are in use
lsof -i :20336
lsof -i :20338
# Check the error output
cat ~/node/ela/output
# Check system logs
journalctl --user -n 50
# Verify binary is executable
file ~/node/ela/ela
chmod a+x ~/node/ela/ela
# Check file descriptor limits
ulimit -n
# Should be 40960 or higher
Common causes:
- Port already in use by another process → kill the conflicting process
- Corrupted chain data → remove ~/node/ela/elastos/data/ and re-sync
- Insufficient permissions → check file ownership
- Missing dependencies → run sudo apt-get install -y jq lsof apache2-utils
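The file descriptor check can be scripted so that a low limit is reported explicitly. This is a minimal sketch, assuming the 40960 threshold from the checks above; the function name check_fd_limit is hypothetical and not part of node.sh.

```shell
#!/bin/bash
# Hypothetical helper: compare a file descriptor limit against a minimum.
# Usage: check_fd_limit <current_limit> [minimum]
check_fd_limit() {
    local current="$1"
    local minimum="${2:-40960}"   # threshold taken from the checks above

    # "unlimited" always satisfies the requirement
    if [ "$current" = "unlimited" ]; then
        echo "ok"
        return 0
    fi

    if [ "$current" -ge "$minimum" ]; then
        echo "ok"
    else
        echo "too low: $current < $minimum"
        return 1
    fi
}
```

To check the current shell, call it as: check_fd_limit "$(ulimit -n)". The limit is passed in as an argument so the comparison can be exercised without changing the real limit.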
Sync Stalled
Symptom: Block height stops increasing.
# Check current height
node.sh ela jsonrpc getcurrentheight
# Check peer count
node.sh ela jsonrpc getconnectioncount
# If peers = 0, check firewall
sudo ufw status
# Ensure port 20338 is open
# Check if the process is alive but stuck
node.sh ela status
# Look at uptime and RAM usage
# Check for disk space
df -h ~/node
# Force reconnection by restarting
node.sh ela stop
sleep 5
node.sh ela start
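Before restarting, it can help to confirm that the height really has stopped moving. The sketch below samples the height twice and compares the values. It assumes that node.sh ela jsonrpc getcurrentheight prints a bare integer (adjust the parsing if your version wraps it in JSON); the function names and the 120-second default interval are hypothetical choices, not part of node.sh.

```shell
#!/bin/bash
# Hypothetical helper: decide whether the chain is stalled from two height samples.
# Prints "advancing by N" if the height increased, otherwise "stalled" (exit 1).
compare_heights() {
    local before="$1" after="$2"
    if [ "$after" -gt "$before" ]; then
        echo "advancing by $((after - before))"
    else
        echo "stalled"
        return 1
    fi
}

# Sample the ELA height twice, <interval> seconds apart, and compare.
# Assumption: the jsonrpc subcommand prints a bare integer height.
watch_height() {
    local interval="${1:-120}"
    local before after
    before=$(node.sh ela jsonrpc getcurrentheight)
    sleep "$interval"
    after=$(node.sh ela jsonrpc getcurrentheight)
    compare_heights "$before" "$after"
}
```

Calling watch_height 300 waits five minutes between samples. Pick an interval comfortably longer than the expected block time, otherwise a healthy node can look stalled.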
ESC/EID sync stall:
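# Check sync status (false means no sync is in progress)
curl -s -H 'Content-Type: application/json' http://127.0.0.1:20636 \
-d '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}' | jq
# Check peer count (the result is a hex string)
curl -s -H 'Content-Type: application/json' http://127.0.0.1:20636 \
-d '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}' | jq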
# Restart with clean peer cache
node.sh esc stop
rm -rf ~/node/esc/data/geth/nodes/
node.sh esc start
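Geth-style JSON-RPC endpoints return quantities such as net_peerCount and eth_blockNumber as hex strings (for example "0x19"), which are awkward to compare by eye. A minimal sketch of a decoder follows; the function name hex_to_dec is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: convert a JSON-RPC hex quantity (e.g. 0x19) to decimal.
# printf interprets a leading 0x as a hexadecimal constant.
hex_to_dec() {
    printf '%d\n' "$1"
}

# Example wiring (not executed here), using the endpoint from the commands above:
#   result=$(curl -s -H 'Content-Type: application/json' http://127.0.0.1:20636 \
#       -d '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}' | jq -r '.result')
#   hex_to_dec "$result"
```

A decoded peer count of 0 points to a connectivity problem rather than a slow sync.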
High Resource Usage
High CPU:
# Identify which process is consuming CPU
# (pgrep prints one PID per line; join them with commas for top -p)
top -p "$( { pgrep -x ela; pgrep -f esc; pgrep -f eid; } | sort -u | paste -sd, -)"
# ELA high CPU during sync is normal
# After sync, CPU should drop to near-zero between blocks
High RAM:
# Check per-process memory
node.sh ela status # Shows RAM usage
# ESC/EID geth forks can consume significant RAM
# Reduce cache if needed:
# Edit the start command to add --cache 512 (default is 1024)
High disk I/O:
# Check which process is writing (iotop normally requires root)
sudo iotop -p $(pgrep -x ela)
# During initial sync, high I/O is expected
# After sync, I/O should be minimal (only new blocks)
RPC Errors
"Connection refused" on RPC port:
# Verify the process is running
pgrep -x ela
# Verify RPC is enabled in config
jq '.Configuration.EnableRPC' ~/node/ela/config.json
# Check what address/port RPC is binding to
lsof -i :20336 -P
# Check RPC credentials
jq '.Configuration.RpcConfiguration' ~/node/ela/config.json
"Unauthorized" on RPC calls:
# Your RPC credentials are in config.json
# (named RPC_USER/RPC_PASS so the shell's own USER variable is not overwritten)
RPC_USER=$(jq -r '.Configuration.RpcConfiguration.User' ~/node/ela/config.json)
RPC_PASS=$(jq -r '.Configuration.RpcConfiguration.Pass' ~/node/ela/config.json)
# Use ela-cli with credentials
~/node/ela/ela-cli --rpcuser "$RPC_USER" --rpcpassword "$RPC_PASS" --rpcport 20336 info getcurrentheight
"IP not in whitelist":
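# Add your IP to the whitelist
# (the temp file stays next to the original: config.json holds RPC
# credentials, and a copy in /tmp may be readable by other users)
jq '.Configuration.RpcConfiguration.WhiteIPList += ["YOUR_IP"]' \
~/node/ela/config.json > ~/node/ela/config.json.tmp && \
mv ~/node/ela/config.json.tmp ~/node/ela/config.json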
# Restart ELA
node.sh ela stop && node.sh ela start
BPoS Issues
"InActive" status after registration:
# Check if the node is fully synced first
node.sh ela jsonrpc getcurrentheight
# Compare against https://blockchain.elastos.io
# Check registration status
node.sh ela status
# Look at BPoS section
# Try reactivation
node.sh ela activate_bpos
Missing blocks / penalties:
# Check node uptime
node.sh ela status # Look at Uptime
# Verify BPoS port is accessible from outside
# From another server:
nc -zv YOUR_SERVER_IP 20339
# Check clock synchronization
timedatectl status
# Install a time sync daemon if the clock is not synchronized:
sudo apt-get install -y chrony
sudo systemctl enable --now chrony
Cross-Chain Failures
ESC deposits not arriving:
# Check arbiter status
node.sh arbiter status
# Check oracle status
node.sh esc-oracle status
# Verify all intermediate components are running
node.sh status
# Check arbiter logs for errors
tail -100 ~/node/arbiter/logs/*.log | grep -i error
Log Analysis
Log locations:
| Chain | Log Directory | Log Pattern |
|---|---|---|
| ELA (node) | ~/node/ela/elastos/logs/node/ | Timestamped entries |
| ELA (BPoS) | ~/node/ela/elastos/logs/dpos/ | Consensus-specific logs |
| ESC | ~/node/esc/logs/ | esc-YYYY-MM-DD-HH_MM_SS.log |
| EID | ~/node/eid/logs/ | eid-YYYY-MM-DD-HH_MM_SS.log |
| PG | ~/node/pg/logs/ | pg-YYYY-MM-DD-HH_MM_SS.log |
| ESC-Oracle | ~/node/esc-oracle/logs/ | esc-oracle_out-*.log, esc-oracle_err.log |
| EID-Oracle | ~/node/eid-oracle/logs/ | eid-oracle_out-*.log, eid-oracle_err.log |
| Arbiter | ~/node/arbiter/logs/ | Timestamped entries |
Common log grep patterns:
# Find errors in ELA logs
grep -i "error\|panic\|fatal" ~/node/ela/elastos/logs/node/*.log | tail -20
# Find BPoS consensus issues
grep -i "timeout\|failed\|reject" ~/node/ela/elastos/logs/dpos/*.log | tail -20
# Find ESC sync issues
grep -i "error\|peer\|disconnect" ~/node/esc/logs/esc-*.log | tail -20
# Find cross-chain issues in arbiter
grep -i "error\|failed\|timeout" ~/node/arbiter/logs/*.log | tail -20
# Find oracle errors
cat ~/node/esc-oracle/logs/esc-oracle_err.log
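When a log directory holds many rotated files, a per-file count shows where the errors cluster before you start reading. This is a minimal sketch; the function name count_log_errors is hypothetical, and the error|panic|fatal pattern mirrors the ELA grep above.

```shell
#!/bin/bash
# Hypothetical helper: print "<count> <file>" for each *.log file in a directory,
# counting lines that match error/panic/fatal (case-insensitive), highest count first.
count_log_errors() {
    local dir="$1"
    local f
    for f in "$dir"/*.log; do
        [ -e "$f" ] || continue    # skip when the glob matches nothing
        printf '%s %s\n' "$(grep -ciE 'error|panic|fatal' "$f")" "$f"
    done | sort -rn
}
```

For example, count_log_errors ~/node/ela/elastos/logs/node lists the ELA node logs with the noisiest file first. The count is matching lines, not distinct errors.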
Diagnostic Commands
Full diagnostic dump:
#!/bin/bash
echo "=== System Info ==="
uname -a
free -h
df -h ~/node
uptime
echo "=== Process Status ==="
node.sh status
echo "=== ELA Chain State ==="
node.sh ela jsonrpc getcurrentheight
node.sh ela jsonrpc getconnectioncount
node.sh ela jsonrpc getbestblockhash
echo "=== ESC Chain State ==="
curl -s -H 'Content-Type: application/json' http://127.0.0.1:20636 \
-d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'
curl -s -H 'Content-Type: application/json' http://127.0.0.1:20636 \
-d '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}'
echo "=== Port Bindings ==="
lsof -i -P | grep -E ':20[0-9]{3}'
echo "=== Recent Errors ==="
grep -i error ~/node/ela/elastos/logs/node/*.log 2>/dev/null | tail -5
grep -i error ~/node/esc/logs/esc-*.log 2>/dev/null | tail -5
echo "=== Disk Usage ==="
du -sh ~/node/*/
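On a partially installed node, some steps of the dump will fail or refer to commands that are not present. One way to keep the output readable is to wrap each step in a helper that prints the header, skips missing commands, and reports a non-zero exit status. This is a sketch under that assumption; the function name section is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: run one diagnostic step under a header and keep going
# if the command is missing or fails, so one broken step does not end the dump.
# Usage: section <title> <command> [args...]
section() {
    local title="$1"; shift
    echo "=== $title ==="
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "(skipped: $1 not found)"
        return 0
    fi
    # Merge stderr into the dump and note a non-zero exit status
    "$@" 2>&1 || echo "(exit status $?)"
}
```

A step from the script above then becomes, for example, section "Process Status" node.sh status, and the whole run can be redirected to a file to attach to a support request.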