UW-SASWE · pritamd47 · Sep 9, 2024 · Aug 24, 2024
diff --git a/docs/Configuration/rat_config.md b/docs/Configuration/rat_config.md
@@ -808,7 +808,7 @@ This section of the configuration file describes the parameters defined by `rout
          If `clean_previous_outputs` is `True`, the previous outputs are cleaned before executing any step in `steps`.
 
     !!! tip_note "Tip"
-        You should use `clean_previous_outputs` if you want to have fresh outputs of RAT for a river basin. Otherwise, by default RAT will keep appending the new outputs to the same files and will concatenate data by calendar dates.
+        You should use `clean_previous_outputs` if you want to have fresh outputs of RAT for a river basin. Otherwise, by default RAT will keep appending the new outputs to the same files and will concatenate data by calendar dates. If the new and previous outputs share some dates, RAT will replace the previous outputs with the new outputs for those dates.
 
 ### Confidential
 * <h6 class="parameter_heading">*`secrets:`* :</h6> 

diff --git a/docs/Development/PatchNotes.md b/docs/Development/PatchNotes.md
@@ -1,5 +1,17 @@
 # Patch Notes
 
+### v3.0.14
+In this release, we have:
+
+1. Enhanced Low-Latency Functionality: RAT can now run in operational mode with significantly reduced latency, as low as 0 days.  
+2. Updated Forecasting Plugin: The forecasting plugin has been upgraded to allow forecast generation for multiple past dates.  
+3. Updated IMERG Precipitation Web Links: The web links for downloading historical IMERG data (prior to 2024) have been updated to match those for current data. This change reflects the revision of the IMERG product version for historical data to V07B, which is now the same as the version for recent IMERG data. These updates were implemented on June 1, 2024, on the IMERG web servers.
+4. Updated RAT Documentation: The documentation now reflects the changes in the forecasting plugin and the option of using low latency in operational mode.
+
+!!!note
+    1. Previously, a latency of 3 or more days was recommended due to delays in retrieving meteorological data from servers. However, RAT can now operate with latencies of less than 3 days, including real-time data (0-day latency). This is a major improvement over earlier versions, enabling users to generate data for the current day and produce forecasts up to 15 days ahead from the current day.
+    2. Previously, forecasts could only be generated for the final date of the RAT run, which worked well for operational use. Now, for case studies and research purposes, users can generate forecasts for several historical dates, offering greater flexibility and utility.
+
 ### v3.0.13
 In this release, we have:
 
@@ -29,7 +41,9 @@ In this release, we have:
 
 ### v3.0.8
 
-In this release, we have added rat.toolbox module to contain helpful and utility functions of user. Right now it has one function related to config and that is to update an existing config. Future work will be to add more functions to this module like to create config, or to plot outputs etc.
+In this release, we have:
+
+1. Added the `rat.toolbox` module to hold helper and utility functions for users. It currently has one config-related function, which updates an existing config. Future work will add more functions to this module, such as creating a config or plotting outputs.
 
 ### v3.0.7
 

diff --git a/docs/Development/RecentAdjustments.md b/docs/Development/RecentAdjustments.md
@@ -1,5 +1,10 @@
 # Recent Adjustments
 
+### September 8, 2024
+Since June 1, 2024, all historical IMERG data has been revised to Version 07B, matching the current IMERG data generated daily. In the last patch we updated the download link for recent and current IMERG data, but the link for historical data was not updated to V07B. This raised a 'Connection Reset' error when running RAT over a period such as 2020 to 2024. The problem has been resolved in the [developer version](../../Development/DeveloperVersion/) of RAT and released in RAT [v3.0.14](../../Development/PatchNotes/#v3014). To [update](https://conda.io/projects/conda/en/latest/commands/update.html) RAT, please use `conda update rat` in your RAT environment. 
+
+Due to the non-availability of IMERG data for April and May 2024 on the IMERG web server, users may encounter issues when running RAT for time periods that include these months. This issue is on the IMERG data provider's side, and we are hopeful it will be resolved soon.
+
 ### June 30, 2024
 There is a change in the required permissions for Google Service Accounts to access Earth Engine API.  
 

diff --git a/docs/Plugins/Forecasting.md b/docs/Plugins/Forecasting.md
@@ -9,12 +9,17 @@ To run the forecast plugin, set the value of the `forecast` option in the PLUGIN
 PLUGINS: 
 	forecast: True 
 	forecast_lead_time: 15
-	forecast_start_time: end_date     # can either be “end_date” or a date in YYYY-MM-DD format
-	forecast_rule_curve_dir: /path/to/rule_curve
+	forecast_gen_start_date: end_date     # can either be “end_date” or a date in YYYY-MM-DD format
+	forecast_gen_end_date: 				# a date in YYYY-MM-DD format (Optional)
+	forecast_rule_curve_dir: /path/to/rule_curve   # (Optional)
 	forecast_reservoir_shpfile_column_dict: {column_id: GRAND_ID, column_capacity: CAP_MCM}
+	forecast_vic_init_state: 			# path of VIC state file or date in YYYY-MM-DD format (Optional)
+	forecast_rout_init_state: 			# path of Routing state file or date in YYYY-MM-DD format (Optional)
+	forecast_storage_scenario: [ST, GO, CO]
+	forecast_storage_change_percent_of_smax: [20, 10, 5]
 ```
 
-The `forecast_start_time` option controls when the forecast will begin. If the value is set to `end_date`, the forecast will begin on the end date of RAT’s normal mode of running, i.e., in nowcast mode, which are controlled by `start_date` and `end_date` options in the BASIN section. Alternatively, a date in the YYYY-MM-DD format can also be provided to start the forecast from that date. The forecast window or the number of days ahead for which the forecast is generated is controlled by the `forecast_lead_time` option, with a maximum of 15 days ahead. The `forecast_rule_curve_dir` option should point to the directory containing the rule curve files for the reservoir. The `forecast_reservoir_shpfile_column_dict` option specifies the names of the columns that correspond to the ID and capacity of the reservoir, named `column_id` and `column_capacity`. These columns should be present in the [`reservoir_vector_file` in the GEE section](../../Configuration/rat_config/#gee) in the configuration file for running the forecasting plugin. 
+The `forecast_gen_start_date` option controls the first date for which a forecast is generated. If the value is set to `end_date`, forecast generation begins on the end date of RAT’s normal mode of running, i.e., nowcast mode, which is controlled by the `start_date` and `end_date` options in the BASIN section. Alternatively, a date in the YYYY-MM-DD format can be provided to start generating forecasts from that date. The `forecast_gen_end_date` option defines the last date for which forecasts will be generated. The forecast window, i.e., the number of days ahead covered by each generated forecast, is controlled by the `forecast_lead_time` option, with a maximum of 15 days. The `forecast_rule_curve_dir` option should point to the directory containing the rule curve files for the reservoir. The `forecast_reservoir_shpfile_column_dict` option specifies the names of the columns that correspond to the ID and capacity of the reservoir, named `column_id` and `column_capacity`. These columns should be present in the [`reservoir_vector_file` in the GEE section](../../Configuration/rat_config/#gee) in the configuration file for running the forecasting plugin. The `forecast_vic_init_state` option specifies the VIC initial state to use, either as a date in YYYY-MM-DD format or as the path to a VIC state file. If a path is provided instead of a date, the state file is assumed to correspond to `forecast_gen_start_date`. In that case, `forecast_rout_init_state` becomes a required parameter and must be the path to the routing initial state file.
 
 !!!note
 	The files in the `forecast_rule_curve_dir` should be named according to their IDs, corresponding to the values in `column_id`. For instance, if the id of a reservoir is 7001, then the file name of the reservoir's rule curve should be `7001.txt`. The rule curve files should be in `.txt` format with the following columns - `Month,S/Smax`, corresponding to the month and the storage level as a fraction of the maximum storage level. Rule curve files for reservoirs in the [GRAND](https://www.globaldamwatch.org/grand) database can be downloaded [here](https://www.dropbox.com/scl/fi/jtquzasjdv2tz1vtupgq5/rc.zip?rlkey=4svbutjd3aup255pnrlgnxbkl&dl=0). 

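As a rough illustration of the options documented in `Forecasting.md` above, the sketch below enumerates the dates on which forecasts would be generated between `forecast_gen_start_date` and `forecast_gen_end_date`, together with the window each forecast covers for a given `forecast_lead_time`. The helper `forecast_windows` is hypothetical and not part of RAT. It assumes that `end_date` resolves to the BASIN `end_date`, that omitting the end date means a single generation date, and that each window starts the day after its generation date; none of these details is confirmed by the diff.

```python
from datetime import date, timedelta

MAX_LEAD_TIME = 15  # documented maximum for forecast_lead_time


def forecast_windows(gen_start, gen_end=None, lead_time=15, basin_end_date=None):
    """Yield (generation_date, first_forecast_day, last_forecast_day) tuples.

    Hypothetical helper, not part of RAT. Assumptions: "end_date" resolves to
    the BASIN end_date, a missing gen_end means a single generation date, and
    each forecast window starts the day after its generation date.
    """
    if not 1 <= lead_time <= MAX_LEAD_TIME:
        raise ValueError(f"lead_time must be between 1 and {MAX_LEAD_TIME}")
    if gen_start == "end_date":
        if basin_end_date is None:
            raise ValueError("basin_end_date is required when gen_start is 'end_date'")
        gen_start = basin_end_date
    if gen_end is None:
        gen_end = gen_start
    if gen_end < gen_start:
        raise ValueError("gen_end must not be earlier than gen_start")

    day = gen_start
    while day <= gen_end:
        yield day, day + timedelta(days=1), day + timedelta(days=lead_time)
        day += timedelta(days=1)


# Forecasts generated for three consecutive past dates, each 15 days long.
windows = list(forecast_windows(date(2022, 8, 1), date(2022, 8, 3), lead_time=15))
for generated_on, first_day, last_day in windows:
    print(generated_on, "->", first_day, "to", last_day)
```

With `gen_start="end_date"` and no end date, the same helper yields a single window, which corresponds to the operational use case described in the patch notes.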
diff --git a/src/rat/core/run_metsim.py b/src/rat/core/run_metsim.py
@@ -38,17 +38,28 @@ def convert_to_vic_forcings(self, forcings_dir):
         log.debug(f"Will create {len(years)} forcing files")
         for year, ds, p in zip(years, dataset, paths):
             if os.path.isfile(p):
-                existing = xr.open_dataset(p)#.load()
-                existing.close()
-
-                log.debug(f"Writing file for year {year}: {p} -- Updating existing")
-                # xr.merge([existing, ds], compat='override', join='outer').to_netcdf(p)
-                # xr.concat([existing, ds], dim='time').to_netcdf(p)
-                last_existing_time = existing.time[-1]
-                log.debug("Existing data: %s", last_existing_time)
+                # Open the existing NetCDF file
+                with xr.open_dataset(p) as existing:
+                    last_existing_time = existing.time[-1]
+                    log.debug(f"Writing file for year {year}: {p} -- Updating existing")
+                    log.debug("Existing data: %s", last_existing_time)
+
+                    # Load the existing data into memory before the dataset is closed
+                    existing_data = existing.sel(
+                        time=slice(None, last_existing_time)
+                    ).load()
+
+                # Select the new data that needs to be appended
                 ds = ds.sel(time=slice(last_existing_time + np.timedelta64(6,'h'), ds.time[-1]))
-                #ds = ds.isel(time=slice(1, None))
-                xr.merge([existing, ds]).to_netcdf(p)
+
+                # Merge the existing data with the new data and save it back to the file
+                xr.merge([existing_data, ds]).to_netcdf(p)
+
+                # # Explicitly delete variables to free up memory
+                # del existing_data, ds
+
+                # # Manually trigger garbage collection to ensure memory is freed
+                # gc.collect()
             else:
                 log.debug(f"Writing file for year {year}: {p} -- Updating new")
                 ds.to_netcdf(p)
diff --git a/src/rat/core/run_postprocessing.py b/src/rat/core/run_postprocessing.py
@@ -129,7 +129,7 @@ def calc_E(res_data, start_date, end_date, forcings_path, vic_res_path, sarea, s
             # Concat the two dataframes into a new dataframe holding all the data (memory intensive):
             complement = pd.concat([existing_data, new_data], ignore_index=True)
             # Remove all duplicates:
-            complement.drop_duplicates(subset=['time'],inplace=True, keep='first')
+            complement.drop_duplicates(subset=['time'],inplace=True, keep='last')
             complement.to_csv(savepath, index=False)
         else:
             data[['time', 'penman_E']].rename({'penman_E': 'OUT_EVAP'}, axis=1).to_csv(savepath, index=False)

diff --git a/src/rat/core/run_routing.py b/src/rat/core/run_routing.py
@@ -152,7 +152,7 @@ def generate_inflow(src_dir, dst_dir):
                 # Concat the two dataframes into a new dataframe holding all the data (memory intensive):
                 complement = pd.concat([existing_data, new_data], ignore_index=True)
                 # Remove all duplicates:
-                complement.drop_duplicates(subset=['date'], inplace=True, keep='first')
+                complement.drop_duplicates(subset=['date'], inplace=True, keep='last')
                 complement.sort_values(by='date', inplace=True)
                 complement.to_csv(outpath, index=False)
             else:

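The switch from `keep='first'` to `keep='last'` in `run_postprocessing.py` and `run_routing.py` is what makes newer values win on coinciding dates. A minimal sketch of the effect follows; the column name `inflow` and all values are made up for illustration, and only the `concat` / `drop_duplicates` / `sort_values` sequence mirrors the diff.

```python
import pandas as pd

# Made-up values; 2024-01-02 and 2024-01-03 appear in both runs.
existing_data = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "inflow": [10.0, 11.0, 12.0],
})
new_data = pd.DataFrame({
    "date": ["2024-01-02", "2024-01-03", "2024-01-04"],
    "inflow": [21.0, 22.0, 23.0],
})

# Existing rows come first, so for a duplicated date the last row is the new one.
complement = pd.concat([existing_data, new_data], ignore_index=True)
complement.drop_duplicates(subset=["date"], inplace=True, keep="last")
complement.sort_values(by="date", inplace=True)

print(complement.to_string(index=False))
```

Because `existing_data` precedes `new_data` in the concatenation, `keep='last'` retains the new rows for 2024-01-02 and 2024-01-03, whereas `keep='first'` would have retained the old ones.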
diff --git a/src/rat/core/run_vic.py b/src/rat/core/run_vic.py
@@ -45,7 +45,7 @@ def generate_routing_input_state(self, ndays, rout_input_state_file, save_path,
             first_existing_time = new_vic_output.time[0]
             new_vic_output.close()
 
-            #Preprocessing function for merging netcdf files
+            # Preprocessing function for merging netCDF files: removes coinciding dates from the first file so that the values from the latest file are used
             def _remove_coinciding_days(ds, cutoff_time, ndays):
                 file_name = ds.encoding["source"]
                 file_stem = Path(file_name).stem
@@ -55,7 +55,7 @@ def _remove_coinciding_days(ds, cutoff_time, ndays):
                     return ds
             remove_coinciding_days_func = partial(_remove_coinciding_days, cutoff_time=first_existing_time, ndays=ndays)
 
-            # Merging previous and new vic outputs
+            # Merge previous and new VIC outputs, keeping 365 days of data from the state file (previous VIC outputs) before the first date in the new VIC output, so coinciding dates are removed from the state file.
             try:
                 save_vic_output = xr.open_mfdataset([rout_input_state_file,self.vic_result],{'time':365}, preprocess=remove_coinciding_days_func)
                 save_vic_output.to_netcdf(save_path)