
ArgoFloat handling of mono-cycle profile files#590

Open: gmaze wants to merge 18 commits into master from argofloat-open-profile


@gmaze gmaze commented Feb 18, 2026

Member

This PR adds new methods to the ArgoFloat class to list, describe and open one or more mono-cycle profile files.

Listing

from argopy import ArgoFloat
af = ArgoFloat(WMO)  # WMO: the float WMO number
af = ArgoFloat(WMO, aux=True)  # same, with the 'aux' option turned on

af.lsp() 
# List of files in the float 'profiles' folder (possibly also including files from the 'aux' folder)
# This is the primary source of information used by the methods below to list/find profile files:

af.describe_profiles()  
# Pandas DataFrame describing all available profile files

af.ls_profiles() 
# Return a dictionary of all available mono-cycle profile files (everything under the 'profiles' sub-folder)
# This is the counterpart, for mono-cycle profile files, of `ls_dataset()` for multi-profile files
  • keys are integers for 'core' ascending profile files (eg: 12 for '<R/D>6903076_012.nc'),
  • keys are strings for all other profile files, with the following convention:
    • 'D' for 'core' descending profile files (eg: '1D' for '<R/D>6903076_001D.nc'),
    • 'B' for BGC ascending profile files (eg: 'B12' for 'B<R/D>6903091_012.nc'),
    • 'BD' for BGC descending profile files (eg: 'B12D' for 'B<R/D>6903091_012D.nc'),
    • 'S' for Synthetic ascending profile files (eg: 'S134' for 'S<R/D>6903091_134.nc').
  • Data from the auxiliary folder use the regular keys with 'aux' appended at the end (eg: '11aux' for 'aux/coriolis/2903797/profiles/R2903797_011_aux.nc').
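
The key convention above can be sketched as a small file-name-to-key function. This is only an illustration of the naming rules listed in this description, not the code used in this PR: `profile_key` and its regular expression are hypothetical, and the sketch assumes file names always follow the patterns shown in the examples.

```python
import re

# Hypothetical pattern for mono-cycle profile file names, e.g.
# 'R6903076_012.nc', 'BD6903091_012D.nc' or 'R2903797_011_aux.nc'
_PROFILE_FILE = re.compile(
    r"^(?P<dataset>[BS]?)"   # '' for core, 'B' for BGC, 'S' for synthetic
    r"(?P<mode>[RD])"        # the <R/D> part of the file name
    r"(?P<wmo>\d+)_"
    r"(?P<cycle>\d+)"
    r"(?P<direction>D?)"     # a trailing 'D' marks a descending profile
    r"(?P<aux>_aux)?"        # present for files from the auxiliary folder
    r"\.nc$"
)


def profile_key(filename):
    """Return an ls_profiles()-style key for a profile file name (sketch)."""
    m = _PROFILE_FILE.match(filename.split("/")[-1])
    if m is None:
        raise ValueError(f"Not a recognised profile file name: {filename!r}")
    cycle = int(m["cycle"])
    suffix = m["direction"] + ("aux" if m["aux"] else "")
    if not m["dataset"] and not suffix:
        return cycle  # core ascending files use a plain integer key
    return f"{m['dataset']}{cycle}{suffix}"
```

For instance, `profile_key('BR6903091_012.nc')` gives `'B12'` and `profile_key('R6903076_012.nc')` gives the integer `12`.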

Open/load netcdf files

Opening one file

To load one single file, use keys from af.ls_profiles():

ds = af.open_profile(12) # cycle number 12, core file
ds = af.open_profile('1D') # cycle number 1, descending core file
ds = af.open_profile('B15') # cycle number 15, BGC file
ds = af.open_profile('B1D') # cycle number 1, descending BGC file
ds = af.open_profile('S28') # cycle number 28, BGC synthetic file
ds = af.open_profile('12aux') # cycle number 12, auxiliary file
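
To make the meaning of each key explicit, here is a minimal sketch that decodes a key into its components. `ProfileKey` and `parse_key` are hypothetical names introduced for illustration only; the 'C'/'A' codes for core/ascending are borrowed from the `dataset='C', direction='A'` arguments used further down, and nothing here is claimed to match the internals of `open_profile()`.

```python
import re
from typing import NamedTuple


class ProfileKey(NamedTuple):
    cycle: int
    dataset: str     # 'C' (core), 'B' (BGC) or 'S' (synthetic)
    direction: str   # 'A' (ascending) or 'D' (descending)
    auxiliary: bool  # True for files from the auxiliary folder


# Hypothetical pattern: optional 'B'/'S' prefix, cycle number,
# optional 'D' for descending, optional 'aux' suffix.
_KEY = re.compile(r"^(?P<dataset>[BS]?)(?P<cycle>\d+)(?P<direction>D?)(?P<aux>aux)?$")


def parse_key(key):
    """Decode an ls_profiles()-style key into its components (sketch)."""
    m = _KEY.match(str(key))
    if m is None:
        raise ValueError(f"Not a recognised profile key: {key!r}")
    return ProfileKey(
        cycle=int(m["cycle"]),
        dataset=m["dataset"] or "C",      # no prefix means a core file
        direction=m["direction"] or "A",  # no 'D' means ascending
        auxiliary=m["aux"] is not None,
    )
```

With this sketch, `parse_key('B1D')` describes cycle 1 of the BGC dataset, descending, not auxiliary.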

Opening (and processing) one or more files

To load one or more files, provide cycle number(s) as a list, and possibly other attributes for subsetting:

ds_list = af.open_profiles([1,2,3])
ds_list = af.open_profiles([1,2,3], direction='D')
ds_list = af.open_profiles([1,2,3], dataset='B') # Return 'BGC' B files
ds_list = af.open_profiles([1,2,3], dataset='B', direction='D') # Return 'BGC' B files, descending
ds_list = af.open_profiles([1,2,3], dataset='S') # Return 'BGC' Synthetic files
ds_list = af.open_profiles([1,2,3], auxiliary=True) # Return files from the 'Auxiliary' folder

# If you don't specify cycle numbers, all cycles are loaded:
ds_list = af.open_profiles(direction='D') # Return *all* core descending files
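
One way to read these calls is as a mapping from the subsetting arguments onto the `ls_profiles()` keys. The sketch below builds the keys that a given selection would correspond to under the key convention described earlier. `make_key` and `keys_for` are hypothetical helpers, and whether `open_profiles()` resolves files this way internally is an assumption, not something this description states.

```python
def make_key(cycle, dataset="C", direction="A", auxiliary=False):
    """Build an ls_profiles()-style key from subsetting arguments (sketch)."""
    if dataset not in ("C", "B", "S"):
        raise ValueError(f"Unknown dataset: {dataset!r}")
    if direction not in ("A", "D"):
        raise ValueError(f"Unknown direction: {direction!r}")
    prefix = "" if dataset == "C" else dataset
    suffix = ("D" if direction == "D" else "") + ("aux" if auxiliary else "")
    if not prefix and not suffix:
        return int(cycle)  # core ascending files use a plain integer key
    return f"{prefix}{int(cycle)}{suffix}"


def keys_for(cycles, **selection):
    """Keys matching a list of cycle numbers and a selection (sketch)."""
    return [make_key(c, **selection) for c in cycles]
```

Under these assumptions, `keys_for([1, 2, 3], dataset='B', direction='D')` yields `['B1D', 'B2D', 'B3D']`.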

open_profiles() comes with additional arguments such as 'progress', and 'preprocess' and 'preprocess_opts' to apply a pre-processing function to each profile file.

It is thus possible to apply a pre-processing function to each file and get the results of this function instead of the netcdf datasets:

def ds2dict(ds):
    return {'cyc': ds['CYCLE_NUMBER'].values[0],
            'posqc': ds['POSITION_QC'].values[0],
            'time': ds['JULD'].values[0],
            'lon': ds['LONGITUDE'].values[0],
            'lat': ds['LATITUDE'].values[0],
            'max_pres': ds['PRES'].max().item(),  # plain scalar, no numpy import needed
           }

data = af.open_profiles(dataset='C', direction='A', progress=True, preprocess=ds2dict)
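
The general pattern behind 'preprocess' can be sketched without any netcdf file. In the sketch below, `apply_preprocess` is a hypothetical stand-in, plain dictionaries play the role of datasets, and the numbers are made-up placeholder values. It also assumes that 'preprocess_opts' is a dictionary of keyword arguments forwarded to the function, which this description does not spell out.

```python
def apply_preprocess(items, preprocess=None, preprocess_opts=None):
    """Apply preprocess(item, **preprocess_opts) to each item (sketch).

    Without a preprocess function, the items are returned unchanged.
    """
    if preprocess is None:
        return list(items)
    opts = preprocess_opts or {}
    return [preprocess(item, **opts) for item in items]


# Placeholder records standing in for opened profile datasets
fake_profiles = [
    {"CYCLE_NUMBER": 1, "PRES": [5.0, 10.0, 2000.0]},
    {"CYCLE_NUMBER": 2, "PRES": [4.8, 9.9, 1987.46]},
]


def summarize(profile, ndigits=0):
    """Reduce one profile to its cycle number and maximum pressure."""
    return {"cyc": profile["CYCLE_NUMBER"],
            "max_pres": round(max(profile["PRES"]), ndigits)}


summaries = apply_preprocess(fake_profiles, preprocess=summarize,
                             preprocess_opts={"ndigits": 1})
```

Each element of `summaries` is the return value of `summarize` for one profile, which mirrors how `ds2dict` above replaces each dataset by a small dictionary.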

‼️ This PR does not update the documentation and does not implement unit tests.

- deprec lsprofiles
- update describe_profiles
- new ls_profiles_for
- new ls_profiles
- new open_profile
- fix bug whereby the default ascending direction was not set
- refactor toward: open_profile and open_profiles methods
- more private place holder for perf.
- better docstrings
- update facade docstring
- improve typing
- new lsp() as a profiles counter part to ls()
- change column names in describe_profiles
@gmaze gmaze self-assigned this Feb 18, 2026
@gmaze gmaze added the enhancement (New feature or request, development) label Feb 18, 2026
@gmaze gmaze moved this from Queued to In Progress in Argopy Management Dashboard Feb 18, 2026
@gmaze gmaze added the next-release (Strongly needed for the next release) label Feb 19, 2026
@gmaze gmaze moved this from In Progress to Stalled in Argopy Management Dashboard Apr 8, 2026
@gmaze gmaze moved this from Stalled to Blocked in Argopy Management Dashboard Jun 18, 2026
@gmaze gmaze requested a review from quai20 June 19, 2026 07:09
gmaze added 4 commits June 19, 2026 13:27
- fix bug whereby argosplitpath could not handle path from the auxiliary folder
- New CYCLE_NUMBERS attribute
- fix bug whereby ls_profiles() would not work for floats whose cycle numbers differ between the euroargofleet API and the netcdf files, which led to a wrong N_CYCLES attribute
- N_CYCLES now depends on CYCLE_NUMBERS
- CYCLE_NUMBERS is determined on the fly by looking at the GDAC folder 'profiles' content, not from API or netcdf content
- clean up 'open_profiles'