How to use numpy masked arrays to create a masked xarray DataArray?

Question

I'm using metpy.calc.windchill in order to calculate wind chill values, and it automatically spits out an array with a numpy mask on it.

t2m, uwind, and vwind all come from ERA5 (on the Google Cloud: https://cloud.google.com/storage/docs/public-datasets) and are 3D arrays with lat, lon, and time.

    import numpy as np
    import numpy.ma as ma
    import xarray as xr
    import metpy.calc
    from metpy.units import units

    t2m = xr.open_dataarray('ERA5_t2m_hourly_2004_2024.nc')
    vwind = xr.open_dataarray('ERA5_vwind_hourly_2004_2024.nc')
    uwind = xr.open_dataarray('ERA5_uwind_hourly_2004_2024.nc')

    def wind_tot(uwind, vwind):
        # Wind speed and direction (degrees) from the u and v components
        wind_mag = np.sqrt(vwind**2 + uwind**2)
        wind_dir = np.arctan2(vwind/wind_mag, uwind/wind_mag)
        wind_dir = wind_dir * 180/np.pi
        return wind_mag, wind_dir

    wind_mag, wind_dir = wind_tot(uwind, vwind)

    wind_chill = metpy.calc.windchill(
        t2m * units.K,
        wind_mag * units('m/s'),
    )

    windchill_ma = wind_chill.to_masked_array()
    mask = ma.getmask(windchill_ma)

The mask is a numpy ndarray with boolean values.

I'd like to convert all values that are masked (set as True in the mask) to np.nan while retaining the DataArray structure.

For example:

    wind_chill = xr.DataArray(array([[[1,6,4,8],[2,4,3,5]],[[2,5,3,6],[2,6,4,8]]]))
    mask = array([[[True,True,False,False],[........
    masked_wind_chill = xr.DataArray(array([[[NaN,NaN,4,8......

masked_wind_chill = wind_chill.where(~mask)

and other variations such as .where(mask=False) just end up not applying the mask to the data at all. If I use .where(mask=True), every data point is masked, regardless of whether the mask value is True or False for that point. Am I using mask wrong?
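For reference, a minimal sketch of how .where() behaves with a boolean NumPy mask of the same shape. The condition is the first positional argument, and values are kept where it is True. The dimension names and the mask contents here are assumptions made for illustration; the numbers reuse the toy example above.

```python
import numpy as np
import xarray as xr

# Toy stand-in for the wind chill field (values from the example above).
wind_chill = xr.DataArray(
    np.array([[[1, 6, 4, 8], [2, 4, 3, 5]],
              [[2, 5, 3, 6], [2, 6, 4, 8]]], dtype=float),
    dims=("time", "lat", "lon"),  # assumed dimension names
)

# Hypothetical mask: True marks a value that should be hidden.
mask = np.zeros(wind_chill.shape, dtype=bool)
mask[0, 0, :2] = True

# where() keeps values where the condition is True and fills the rest
# with NaN, so the mask has to be inverted.
masked_wind_chill = wind_chill.where(~mask)
```

If this pattern leaves the data unchanged, the mask itself is probably all False, which points at how the mask was obtained rather than at .where().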

Edit: Implementing arr.filled:

    windchill_ma = wind_chill.to_masked_array()
    masked_windchill = windchill_ma.filled(np.nan)
    windchill_da = xr.DataArray(masked_windchill, coords={
        'time': wind_chill.time,
        'lat': wind_chill.lat,
        'lon': wind_chill.lon,
    })
    windchill_da

gives similar results. I found that the methods I'm using are working as intended, but the mask I have doesn't seem to correspond to the areas marked as '--' in the wind_chill dataset, which is what I would like to mask for, since they're above the necessary temperature and below the wind thresholds.

mask = ma.getmask(windchill_ma)

returns an array full of 'False' at timesteps that are marked entirely as '--' by wind_chill, which in turn should be True in the mask, I think. Is the mask I'm using not correct?
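One possible explanation, offered as an assumption rather than a confirmed diagnosis of this dataset: DataArray.to_masked_array() appears to build its mask from the null (NaN) entries of the values, not from a mask carried by the underlying array. The toy array below only illustrates that behaviour.

```python
import numpy as np
import xarray as xr

# Toy DataArray with a single NaN entry.
da = xr.DataArray(np.array([1.0, np.nan, 3.0]), dims=("x",))

# The resulting mask is True only where the value is NaN.
ma_arr = da.to_masked_array()
nan_mask = np.ma.getmaskarray(ma_arr)
```

Under that assumption, points that still hold real numbers would come back as False in the mask, no matter how they are displayed.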

Edit 2:

Looking at a specific time step:

(Screenshot of the wind_chill output at that time step, with values displayed as '--'; image not preserved.)

I'd like to convert all values that are masked as '--' to np.nan while retaining the DataArray structure.

However, when I look at the actual values using wind_chill.sel(time='2020-08-25 T12:00:00').values, every value is present, including the ones displayed as '--'. As a result, when I plot this time step, all values are shown, even those that should be masked out. Methods like .filled() and .fillna() do not work for the same reason. When I apply .to_masked_array(), the resulting mask is also False for all the values displayed as '--'. Is there a way to find the values that are displayed as '--' and build a mask from them so that I can set them to NaN?
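One way to sidestep the hidden mask is to rebuild the condition from the inputs. This is a sketch under stated assumptions: MetPy's documentation for windchill describes a mask_undefined flag that masks points where the temperature is above 50 °F or the wind speed is at or below 3 mph. Converted to the units used here, that is 283.15 K and 1.34112 m/s. The arrays below are illustrative stand-ins, not ERA5 data, and the constant names are hypothetical.

```python
import numpy as np
import xarray as xr

# Assumed thresholds: 50 degF = 283.15 K, 3 mph = 1.34112 m/s.
T_MAX_K = 283.15
WIND_MIN_MS = 1.34112

# Toy stand-ins for t2m (K), wind_mag (m/s) and the wind chill output.
t2m = xr.DataArray(np.array([[270.0, 290.0], [275.0, 280.0]]), dims=("lat", "lon"))
wind_mag = xr.DataArray(np.array([[5.0, 5.0], [0.5, 3.0]]), dims=("lat", "lon"))
wind_chill = xr.DataArray(np.array([[-10.0, 15.0], [1.0, 4.0]]), dims=("lat", "lon"))

# True where wind chill is defined under the assumed thresholds.
defined = (t2m <= T_MAX_K) & (wind_mag > WIND_MIN_MS)

# Undefined points become NaN; dimensions and coordinates are kept.
wind_chill_nan = wind_chill.where(defined)
```

Because the condition is itself a DataArray, it aligns with wind_chill by dimension name, so no round trip through a masked array is needed. Whether this works unchanged on the unit-aware output of metpy.calc.windchill is not verified here.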

The use of .to_masked_array() implies you already have a DataArray? So why are you converting to a masked array to do things with the mask, and then converting back to a DataArray? – DopplerShift, Commented Mar 7 at 21:15
I'm converting the DataArray to a masked array so that I can access the mask and use arr.filled. I haven't found another way to apply the mask to the DataArray as arr.filled does, but if there is one I'd like to know! I then turn the masked data back into a DataArray so I can reapply the coordinates. – user29903541, Commented Mar 10 at 15:26
I would suggest looking at using .fillna() on the DataArray itself to fill the masked/bad values and avoid any need to go back and forth to masked arrays: docs.xarray.dev/en/stable/generated/… – DopplerShift, Commented Mar 10 at 20:05
I think we might be on different pages here, unfortunately. When I apply metpy.calc.windchill, the result shows '--' when I view it with wind_chill.sel(time=timestep), but it has actual values when I view it with wind_chill.sel(time=timestep).values. Therefore, when I apply .fillna(), nothing changes. I would like every value that shows as '--' in wind_chill.sel(time=timestep) to be NaN in wind_chill.sel(time=timestep).values. I will update the question to show exactly what I'm seeing. – user29903541, Commented Mar 10 at 20:25

DopplerShift · Accepted Answer · 2025-03-05 21:02:52Z


Given a masked array, you can fill the masked values with nan using the .filled() method:

    import numpy as np

    arr = np.ma.array([1., 2., 3.], mask=[False, True, False])
    filled_arr = arr.filled(np.nan)

gives array([ 1., nan, 3.]).
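Two details around .filled() that are easy to trip over, shown as a small sketch: np.ma.getmask can return the np.ma.nomask sentinel when nothing is masked, whereas np.ma.getmaskarray always returns a full boolean array; and an integer array cannot hold NaN, so it needs a cast to float first. The variable names are illustrative.

```python
import numpy as np

arr = np.ma.array([1.0, 2.0, 3.0], mask=[False, True, False])

# getmaskarray always returns a boolean array with the data's shape.
mask = np.ma.getmaskarray(arr)
filled_arr = arr.filled(np.nan)

# With no mask given, getmask returns the nomask sentinel instead.
unmasked = np.ma.array([1.0, 2.0])
has_no_mask = np.ma.getmask(unmasked) is np.ma.nomask
full_mask = np.ma.getmaskarray(unmasked)

# Integer data cannot store NaN, so cast to float before filling.
int_arr = np.ma.array([1, 2, 3], mask=[False, True, False])
filled_int = int_arr.astype(float).filled(np.nan)
```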


Thank you so much for your response! I tried this out and found that my error seems to be in my mask and not in the implementation. My mask shows False in areas where it should be True. For instance, in the original masked array (here arr), all the values for a timestep are shown as '--', which I think indicates that they're True in the mask. Yet at that timestep in the mask, all the values are False, so nothing gets filled with NaN in filled_arr. Am I misunderstanding how the mask works?
– user29903541
Commented Mar 5 at 21:38
Can you edit your original question and add more complete example code that shows where t2m and wind_mag are created/read?
– DopplerShift
Commented Mar 7 at 0:32
Main question edited! Please let me know if there's any additional info I need to give.
– user29903541
Commented Mar 7 at 16:39
