- Make sure "hasShutdown" is set to "false" for each known worker that was restarted.

- Fix the markdown of the URLs in the Prometheus README.
Lampros Smyrnaios 2023-05-16 12:24:14 +03:00
parent b499209ce3
commit f7f919cee1
3 changed files with 10 additions and 5 deletions
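
The restart handling described in the commit message can be illustrated with a minimal, self-contained sketch. It is not the project's actual code: the class `WorkerRegistrySketch`, the nested `WorkerInfo` stand-in, the `workersInfoMap` field and the returned status strings are all assumptions made for illustration. Only the getter and setter names mirror those visible in the diff below.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WorkerRegistrySketch {

    // Hypothetical stand-in for the project's WorkerInfo class.
    public static class WorkerInfo {
        private String workerIP;
        private boolean hasShutdown;

        public WorkerInfo(String workerIP, boolean hasShutdown) {
            this.workerIP = workerIP;
            this.hasShutdown = hasShutdown;
        }

        public String getWorkerIP() { return workerIP; }
        public void setWorkerIP(String workerIP) { this.workerIP = workerIP; }
        public boolean getHasShutdown() { return hasShutdown; }
        public void setHasShutdown(boolean hasShutdown) { this.hasShutdown = hasShutdown; }
    }

    // Assumed registry of known workers, keyed by worker id.
    public static final Map<String, WorkerInfo> workersInfoMap = new ConcurrentHashMap<>();

    // Returns a status string (an assumption of this sketch) describing what happened.
    public static String handleAssignmentRequest(String workerId, String remoteAddr) {
        WorkerInfo workerInfo = workersInfoMap.get(workerId);
        if ( workerInfo == null ) {
            // First request from this worker: store its IP.
            workersInfoMap.put(workerId, new WorkerInfo(remoteAddr, false));
            return "registered";
        }
        String status = "known";
        if ( !workerInfo.getWorkerIP().equals(remoteAddr) ) {
            workerInfo.setWorkerIP(remoteAddr); // The worker came back with a new IP.
            status = "ip-changed";
        }
        if ( workerInfo.getHasShutdown() ) {
            // The worker had shut down (or was marked as such) and is now back.
            workerInfo.setHasShutdown(false);
            status = "restarted";
        }
        return status;
    }
}
```

In this sketch the shutdown flag is checked independently of the IP comparison, so a worker that restarts on the same IP is still marked as running again.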

@@ -6,16 +6,16 @@
### Check the metrics
- To check the raw metrics hit this url in your browser: http://<IP>:1880/api/actuator/prometheus
- To check the metrics through GUI, with some graphs, hit: http://<IP>:9090
- To check the raw metrics check the output of this url, in your browser: http://\<IP\>:1880/api/actuator/prometheus
- To check the metrics through GUI, with some graphs, check: http://\<IP\>:9090
### Visualize metrics in Grafana
- Access grafana in: http://<IP>:3000/
- Access grafana in: http://\<IP\>:3000/
- Give the default username and password: "admin" (for both).
- Specify the new password.
- Then, add a "Prometheus" datasource.
- Specify the prometheus url: http://<IP>:9090
- Specify the prometheus url: http://\<IP\>:9090
- Save the datasource.
- Go to "dashboards", click "import", add the number "12900" in the input and click "load" and "next".
- Then select the "prometheus" datasource and click "import".

@@ -14,5 +14,5 @@ scrape_configs:
metrics_path: '/api/actuator/prometheus' # Job to scrape application metrics
scrape_interval: 15s
scrape_timeout: 10s
static_configs: # TODO - Find a way to automatically apply the publicIP of the host machine it is running on.
static_configs: # TODO - Check if there is a way to automatically apply the publicIP of the host machine it is running on.
- targets: [ '<SERVER_IP>:1880' ]

@@ -91,6 +91,11 @@ public class UrlsController {
if ( !savedWorkerIp.equals(remoteAddr) ) {
logger.warn("The worker with id \"" + workerId + "\" has changed IP from \"" + savedWorkerIp + "\" to \"" + remoteAddr + "\".");
workerInfo.setWorkerIP(remoteAddr); // Set the new IP. The update will be reflected in the map.
} // In this case, the worker may have previously informed the Controller that it has shut down, or it may have crashed.
if ( workerInfo.getHasShutdown() ) {
logger.info("The worker with id \"" + workerId + "\" was restarted.");
workerInfo.setHasShutdown(false);
}
} else {
logger.info("The worker \"" + workerId + "\" is requesting assignments for the first time. Going to store its IP and create the remote parquet subdirectories (in HDFS).");