I'm using metpy.calc.windchill in order to calculate wind chill values, and it automatically spits out an array with a numpy mask on it.
t2m, uwind, and vwind all come from ERA5 (on the Google Cloud: https://cloud.google.com/storage/docs/public-datasets) and are 3D arrays with lat, lon, and time.
import numpy as np
import numpy.ma as ma
import xarray as xr
import metpy.calc
from metpy.units import units

t2m = xr.open_dataarray('ERA5_t2m_hourly_2004_2024.nc')
vwind = xr.open_dataarray('ERA5_vwind_hourly_2004_2024.nc')
uwind = xr.open_dataarray('ERA5_uwind_hourly_2004_2024.nc')

def wind_tot(uwind, vwind):
    # Wind speed from the u and v components
    wind_mag = np.sqrt(vwind**2 + uwind**2)
    # Angle of the wind vector, converted from radians to degrees
    wind_dir = np.arctan2(vwind/wind_mag, uwind/wind_mag)
    wind_dir = wind_dir * 180/np.pi
    return wind_mag, wind_dir

wind_mag, wind_dir = wind_tot(uwind, vwind)

wind_chill = metpy.calc.windchill(
    t2m * units.K,
    wind_mag * units('m/s'),
)

windchill_ma = wind_chill.to_masked_array()
mask = ma.getmask(windchill_ma)
The mask is a numpy ndarray with boolean values.
I'd like to convert all values that are masked (set as True in the mask) to np.nan while retaining the DataArray structure.
For example:
wind_chill = xr.DataArray(array([[[1,6,4,8],[2,4,3,5]],[[2,5,3,6],[2,6,4,8]]]))
mask = array([[[True,True,False,False],[........
masked_wind_chill = xr.DataArray(array([[[NaN,NaN,4,8......
masked_wind_chill = wind_chill.where(~mask)
and other variations of .where(mask=False) and whatnot just end up not applying the mask to the data at all. If I use .where(mask=True), all data points are masked regardless of whether the mask value for that data point is True or False. Am I using the mask wrong?
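For reference, here is a minimal sketch of how DataArray.where treats its condition, using small made-up values rather than the ERA5 data. where() keeps values where the condition is True and writes NaN elsewhere, so a mask that is True at the points to hide has to be inverted first. The condition is passed positionally here, and the name keep is only for illustration.

```python
import numpy as np
import xarray as xr

# Hypothetical toy values, not taken from the ERA5 files.
wind_chill = xr.DataArray(
    np.array([[1.0, 6.0, 4.0, 8.0], [2.0, 4.0, 3.0, 5.0]]),
    dims=("lat", "lon"),
)
# True marks the points that should become NaN.
mask = np.array([[True, True, False, False], [False, False, True, False]])

# Wrap the inverted mask with the same dims so the alignment is explicit.
# where() keeps values where the condition is True and writes NaN elsewhere.
keep = xr.DataArray(~mask, dims=wind_chill.dims)
masked_wind_chill = wind_chill.where(keep)

print(masked_wind_chill.values)
```

If this behaves as expected on toy data but not on the real array, the problem is more likely in the mask itself than in the call to where().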
Edit: Implementing arr.filled:
windchill_ma = wind_chill.to_masked_array()
masked_windchill = windchill_ma.filled(np.nan)
windchill_da = xr.DataArray(masked_windchill, coords={
    'time':wind_chill.time,
    'lat':wind_chill.lat,
    'lon':wind_chill.lon,
})
windchill_da
gives similar results. I found that the methods I'm using are working as intended, but the mask I have doesn't seem to correspond to the areas marked as '--' in the wind_chill dataset, which is what I would like to mask for, since they're above the necessary temperature and below the wind thresholds.
mask = ma.getmask(windchill_ma)
returns an array full of 'False' at timesteps that are marked entirely as '--' by wind_chill, which in turn should be True in the mask, I think. Is the mask I'm using not correct?
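A possible explanation, sketched on toy data: as far as I understand it, DataArray.to_masked_array() builds its mask from missing values (NaN) in the DataArray, not from whatever is displayed as '--'. If the values behind '--' are ordinary numbers, the mask would come back all False and filled(np.nan) would have nothing to fill. The arrays below are made up for illustration.

```python
import numpy as np
import numpy.ma as ma
import xarray as xr

# Hypothetical toy values: one entry is genuinely missing (NaN).
da = xr.DataArray(np.array([[1.0, np.nan], [3.0, 4.0]]), dims=("lat", "lon"))

# The resulting mask is True only where the DataArray holds NaN.
arr = da.to_masked_array()
mask = ma.getmaskarray(arr)
print(mask)

# filled(np.nan) writes NaN into masked slots only; other values are untouched.
filled = arr.filled(np.nan)

# With no NaN in the data, nothing is masked, so filled() changes nothing.
no_nan = xr.DataArray(np.array([[1.0, 2.0], [3.0, 4.0]]), dims=("lat", "lon"))
print(ma.getmaskarray(no_nan.to_masked_array()).any())
```

ma.getmaskarray is used instead of ma.getmask so that a full boolean array comes back even when nothing is masked.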
Edit 2:
Looking at a specific time step:
I'd like to convert all values that are masked as '--' to np.nan while retaining the DataArray structure.
However, looking at the actual values using wind_chill.sel(time='2020-08-25 T12:00:00').values, all the values are present, including the ones marked as '--'. As a result, when I plot this timestep, all values are shown, even the ones that should be masked out with '--'. Methods like .filled() and .fillna() don't work for the same reason. When I attempt to apply a mask using .to_masked_array(), the mask is also set to False for all the values marked as '--'. Is there a way to find the values that are displayed as '--' and create a mask from them, so I can set those values to NaN?
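One way to sidestep the '--' display entirely is to rebuild the condition from the inputs. The sketch below assumes the thresholds given in MetPy's description of the mask_undefined option of windchill (wind chill undefined for temperature above 50 degF or wind speed at or below 3 mph), converted by hand to kelvin and m/s. The toy arrays stand in for t2m, wind_mag and the wind chill output, and the names T_MAX_K, WIND_MIN_MS and undefined are hypothetical.

```python
import numpy as np
import xarray as xr

# Thresholds assumed from MetPy's description of mask_undefined:
# wind chill is undefined for temperature > 50 degF or wind speed <= 3 mph.
T_MAX_K = 283.15        # 50 degF expressed in kelvin
WIND_MIN_MS = 1.34112   # 3 mph expressed in m/s

dims = ("lat", "lon")
# Hypothetical toy fields standing in for t2m, wind_mag and the wind chill output.
t2m = xr.DataArray(np.array([[270.0, 290.0], [275.0, 280.0]]), dims=dims)
wind_mag = xr.DataArray(np.array([[5.0, 5.0], [1.0, 4.0]]), dims=dims)
wind_chill = xr.DataArray(np.array([[262.0, 289.0], [274.0, 276.0]]), dims=dims)

# True where wind chill should be treated as undefined.
undefined = (t2m > T_MAX_K) | (wind_mag <= WIND_MIN_MS)

# Keep the defined points and set the rest to NaN; dims are preserved.
masked_wind_chill = wind_chill.where(~undefined)
print(masked_wind_chill.values)
```

With the real data, passing mask_undefined=False to metpy.calc.windchill should return unmasked output that can then be masked this way, though it is worth checking against the MetPy version in use.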
.to_masked_array() implies you already have a DataArray? So why are you converting to a masked array to do things with the mask, and then converting back to a DataArray? You can use .fillna() on the DataArray itself to fill the masked/bad values and avoid any need to go back and forth to masked arrays: docs.xarray.dev/en/stable/generated/…