- Added voltage monitoring table and storage pipeline
- Extended pool payload to 17 bytes with VCC field (protocol v2)
- Improved database connection pool resilience (reduced pool size, aggressive recycling, pool disposal on failure)
- Added environment variable support for database configuration
- Fixed receiver MQTT deprecation warning (CallbackAPIVersion.VERSION2)
- Silenced excessive RSSI status logging in receiver
- Added reset flag tracking and reporting
- Updated Docker compose with DB config and log rotation limits
Database Connectivity Issues - Analysis & Fixes
Problem Summary
The NAS container experiences intermittent database connectivity failures. SQLAlchemy logs "Exception during reset or similar" followed by:

```
_mysql_connector.MySQLInterfaceError: Lost connection to MySQL server during query
```
Meanwhile, Docker Desktop deployments work reliably, and MySQL Workbench can connect without issues.
Root Causes Identified
1. Aggressive Connection Pool Settings
- Old config: `pool_size=5` + `max_overflow=10` = up to 15 simultaneous connections
- Problem: Creates excessive connections that exhaust database resources or trigger connection limits
- Result: Pool reset failures when trying to return/reset dead connections
2. Insufficient Connection Recycling
- Old config: `pool_recycle=1800` (30 minutes)
- Problem: Connections held too long; the database may time out and close them due to `wait_timeout` or network issues
- Result: When SQLAlchemy tries to reuse connections, they're already dead
3. Conflicting autocommit Setting
- Old config: `autocommit=True` in `connect_args`
- Problem: When autocommit is enabled there is nothing to roll back, but SQLAlchemy still attempts a rollback during pool reset
- Result: Rollback fails on dead connections → traceback logged
4. Pool Reset on Dead Connections
- Config: `pool_reset_on_return="none"` (correct), but the pool was not disposed on failure
- Problem: When a connection died, the pool kept trying to reuse it
- Result: Repeated failures until the next retry window (30 seconds)
5. Network/Database Timeout Issues (NAS-specific)
- Likely cause: NAS MariaDB has aggressive connection timeouts
- Or: Container network has higher packet loss/latency than Docker Desktop
- Or: Pool exhaustion prevents new connections from being established
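To separate the network-level suspects from pool problems, a quick raw TCP probe from inside the container can confirm whether the database port is reachable at all. This is a stdlib-only sketch, not part of datacollector.py:

```python
import socket

def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    A False result points at the network or the server process;
    a True result with pool errors points at timeouts or pool handling.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `tcp_reachable("192.168.43.102", 3306)` from a shell inside the NAS container during an outage tells you immediately whether the failure is below the SQL layer.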
Applied Fixes
✅ Fix 1: Conservative Connection Pool (Lines 183-195)
```python
pool_size=3,        # Reduced from 5
max_overflow=5,     # Reduced from 10
pool_recycle=300,   # Reduced from 1800 (every 5 mins vs 30 mins)
                    # autocommit=False removed from connect_args:
                    # let SQLAlchemy manage transactions
```
Why this works:
- Fewer simultaneous connections = less resource contention
- Aggressive recycling = avoids stale connections killed by database
- Proper transaction management = cleaner rollback handling
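Put together, the conservative pool configuration looks roughly like the sketch below. The settings are collected in a plain dict so they can be inspected and tested without a database; in the real app they would be passed straight to `sqlalchemy.create_engine(DB_URL, **ENGINE_KWARGS)`. The `pool_pre_ping` line is an optional extra not mentioned above (it validates each connection before use):

```python
# Conservative pool settings from Fix 1, ready to splat into create_engine().
# Note: autocommit is deliberately absent from connect_args so SQLAlchemy
# manages transactions itself.
ENGINE_KWARGS = {
    "pool_size": 3,                  # down from 5
    "max_overflow": 5,               # down from 10
    "pool_recycle": 300,             # recycle every 5 min instead of 30
    "pool_reset_on_return": "none",  # don't roll back on return
    "pool_pre_ping": True,           # optional: test connections before use
}
```

Keeping the kwargs in one place also makes it easy to log the effective pool configuration at startup alongside the DB host.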
✅ Fix 2: Pool Disposal on Connection Failure (Lines 530-533)
```python
except exc.OperationalError as e:
    sql_engine.dispose()  # ← CRITICAL: force all connections to be closed/recreated
    logger.warning(f"Lost database connectivity: {e}")
```
Why this works:
- When connection fails, dump the entire pool
- Next connection attempt gets fresh connections
- Avoids repeated failures trying to reuse dead connections
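The dispose-then-retry pattern can be factored into a small wrapper. This is an illustrative sketch, not datacollector.py's actual code: the `engine` argument only needs a `.dispose()` method (matching SQLAlchemy's `Engine`), and `errors` would be `(sqlalchemy.exc.OperationalError,)` in the real app:

```python
import logging
import time

logger = logging.getLogger("datacollector")

def run_with_pool_reset(engine, op, retries=1, delay=0.0, errors=(Exception,)):
    """Run op(engine); on a connection error, dispose the pool and retry,
    so the next attempt starts from fresh connections instead of reusing
    dead ones."""
    for attempt in range(retries + 1):
        try:
            return op(engine)
        except errors as e:
            engine.dispose()  # dump every pooled connection, dead or alive
            logger.warning("Lost database connectivity: %s", e)
            if attempt == retries:
                raise  # out of retries; let the caller's fallback handle it
            time.sleep(delay)
```

A caller would wrap its batch-insert function with this and fall back to the local SQLite queue only when the retries are exhausted.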
✅ Fix 3: Environment Variable Support (Lines 169-175)
```python
DB_HOST = os.getenv("DB_HOST", "192.168.43.102")
DB_PORT = int(os.getenv("DB_PORT", "3306"))
# ... etc
```
Why this matters:
- Different deployments can now use different database hosts
- Docker Desktop can use `192.168.43.102`
- The NAS can use `mariadb` (Docker DNS) or a different IP if needed
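A full loader for the environment-driven settings might look like this sketch. The host/port defaults match the code excerpt above; the remaining names and defaults are assumptions taken from the compose example later in this document:

```python
import os

def load_db_config() -> dict:
    """Read database settings from the environment, falling back to the
    Docker Desktop defaults. Env var names mirror the compose file."""
    return {
        "host": os.getenv("DB_HOST", "192.168.43.102"),
        "port": int(os.getenv("DB_PORT", "3306")),
        "user": os.getenv("DB_USER", "weatherdata"),
        "password": os.getenv("DB_PASSWORD", ""),  # no default; supply via env
        "database": os.getenv("DB_NAME", "weatherdata"),
        "connect_timeout": int(os.getenv("DB_CONNECT_TIMEOUT", "5")),
    }
```

Because the values are resolved at import/startup time, the same image runs unmodified on Docker Desktop and on the NAS with only the environment block differing.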
Recommended MariaDB Configuration
The NAS MariaDB should have appropriate timeout settings:
```sql
-- Check current settings
SHOW VARIABLES LIKE 'wait_timeout';
SHOW VARIABLES LIKE 'interactive_timeout';
SHOW VARIABLES LIKE 'max_connections';
SHOW VARIABLES LIKE 'max_allowed_packet';
```
Recommended settings (in `/etc/mysql/mariadb.conf.d/50-server.cnf`):
```ini
[mysqld]
wait_timeout = 600          # 10 minutes (allow idle connections longer)
interactive_timeout = 600
max_connections = 100       # Ensure enough for pool + Workbench
max_allowed_packet = 64M
```
Deployment Instructions
For Docker Desktop:
```shell
# Use the defaults or override in your compose file
docker-compose -f docker-compose.yml up
```
For NAS:
Update your docker-compose or environment file:
```yaml
environment:
  - DB_HOST=192.168.43.102    # or your NAS's actual IP/hostname
  - DB_PORT=3306
  - DB_USER=weatherdata
  - DB_PASSWORD=cfCU$swM!HfK82%*
  - DB_NAME=weatherdata
  - DB_CONNECT_TIMEOUT=5
```
Monitoring
The application now logs database configuration at startup:
```
DB config: host=192.168.43.102:3306, user=weatherdata, db=weatherdata
```
Monitor the logs for:
- "Database reachable again" → Connection recovered
- "Lost database connectivity" → Transient failure detected and pool disposed
- "Stored batch locally" → Data queued to SQLite while DB unavailable
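The "Stored batch locally" fallback can be illustrated with a stdlib-only SQLite queue. This is a sketch of the mechanism, not datacollector.py's actual schema; the table name, column, and at-most-once drain semantics here are illustrative:

```python
import json
import sqlite3

def store_batch_locally(db_path, batch):
    """Queue one batch of readings in SQLite while MariaDB is unreachable."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS pending_batches (payload TEXT)")
    conn.execute("INSERT INTO pending_batches (payload) VALUES (?)",
                 (json.dumps(batch),))
    conn.commit()
    conn.close()

def drain_local_batches(db_path):
    """Return all queued batches in insertion order and clear the queue.

    Simplification: batches are deleted before the caller re-inserts them
    into MariaDB, so a crash in between loses them; a production version
    would delete only after a successful upload.
    """
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT rowid, payload FROM pending_batches ORDER BY rowid").fetchall()
    batches = [json.loads(payload) for _, payload in rows]
    conn.execute("DELETE FROM pending_batches")
    conn.commit()
    conn.close()
    return batches
```

On the "Database reachable again" transition, the app would call the drain function and replay each batch into MariaDB.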
Testing
Test 1: Verify Environment Variables
```shell
# Run the container with an override
docker run -e DB_HOST=test-host ... python datacollector.py
# Check the log for: "DB config: host=test-host:3306"
```
Test 2: Simulate Connection Loss
```python
# In a Python shell connected to the container
import requests
requests.get('http://container:port/shutdown')  # Reconnect simulation
# Should see: "Database still unreachable" → "Database reachable again"
```
Test 3: Monitor Pool State
Enable pool logging:
```python
echo_pool=True  # Line 195 in datacollector.py
```
Expected Behavior After Fix
- ✅ Connection pool adapts to transient failures
- ✅ Stale connections are recycled frequently
- ✅ Pool is disposed on failure to prevent cascading errors
- ✅ Different environments can specify different hosts
- ✅ Data is cached locally if database is temporarily unavailable