Read Snapshot Data


NOTE

From v0.4 and up snapshots are read in proper column-major order, as it should be for Julia. This means that position data for particle i neads be accessed as:

x = pos[1,i]
y = pos[2,i]
z = pos[3,i]

GadgetIO.jl is specialized to read Gadget snapshots of Format 2. The structure of a Format 2 snapshot is as follows:

8              # size of the blockname block (Int32)
BLOCKNAME      # Blockname (4*Char)
8+SIZE_BLOCK   # number of bytes to skip if block should not be read
8              # end of blockname block

SIZE_BLOCK     # size of the current block in bytes
{...}          # content of the block ordered by particle type
SIZE_BLOCK     # end of the current block

which repeats for every block.

Filename

Before we get started, a short bit for clarification: In what follows we need to distinguish between filename and filebase. For small simulations the data is written to a single file. In that case you can simply supply an absolute, or relative path to the one snapshot file as filename.

For larger simulations the data may be distributed over multiple files, which again may be in individual snapshot directories. With the snapshots being distributed over multiple files you need to supply the base-name filebase. Assuming you want to read snapshot 140, which is in the snapshot directory 140 the filebase is

filebase = "path/to/your/snapshot/directories/snapdir_140/snap_140"

In the relevant functions GadgetIO.jl will then automatically loop through the sub-snapshots which end in ".0", ".1", ... , ".N".

Reading a snapshot

There are multiple ways to read the data in the snapshot

Full snapshot

If you want to read a simulation snapshot into memory with GadgetIO.jl, it's as easy as this:

GadgetIO.read_snapFunction
read_snap(filename::String [, blockname::String="", parttype::Integer=-1] )

Wrapper function to read snapshot in various ways: filename only: returns the entire snapshot as a dictionary. blockname: Returns only that block. If parttype specified only for that particle type.

Examples

julia> gas_pos = read_snap(filename, "POS", 0)
source

This will return a dictionary with the header information in data["Header"] and the blocks sorted by particle type.

As an example, this is how you would access the positions of the gas particles:

    data["Parttype0"]["POS"]

Specific blocks

If you only want to read a specific block for a single particle type, e.g. positions of gas particles, you can use the function read_block with a specified blockname and particle type.

GadgetIO.read_blockFunction
read_block(filename::String, blockname::String;
            parttype::Integer=0,
            block_position::Integer=-1,
            info::Union{Nothing,InfoLine}=nothing,
            h::Union{Nothing,SnapshotHeader}=nothing,
            offset=0, n_to_read=-1)

Reads a block in a snapshot with given name. Defaults to reading gas particles. Block Names are case sensitive.

Keyword Arguments

  • parttype: Which particle type to read (0-indexed)
    • 0: Gas (default)
    • 1: Dark Matter
    • 2: Boundary
    • 3: Bulge
    • 4: Stars
    • 5: Black Holes
    • -1: All particle types
  • block_position: Position of the block in the data file [bytes]. Can be used to speed up IO.
  • info: InfoLine for the given block. Can be used to speed up IO, or if no INFO block is present.
  • h: SnapshotHeader for given file. Can be used to speed up IO.
  • offset: Adds an offset to start reading the block at a later point. Can be used for sub-IO.
  • n_to_read: How many particles to read. Can be used for sub-IO.

Examples

In case an INFO block is present:

gas_pos = read_block(filename, "POS", parttype=0)

If not you need to supply your own InfoLine

pos_info = InfoLine("POS", Float32, 1, [1, 1, 1, 1, 1, 1])
gas_pos = read_block(filename, "POS", info=pos_info, parttype=0)
source

This will return an array of the datatype of your simulation, usually Float32.

If the snapshot has no info block this will fail unfortunately.

You can still read the specific block by supplying a hand-constructed InfoLine struct and passing that to the function read_block:

pos_info = InfoLine("POS", Float32, 3, [1,1,1,1,1,1])
pos      = read_block(filename, "POS", info=pos_info, parttype=0)

I will collect some example InfoLine structs in a later release to be able to read some common blocks even without an INFO block.

Since v0.5 read_snap and read_block also work if you pass them a file_base.

Since v0.7 read_block reads the full block if parttype=-1 is set.

Read Subvolumes

If you only want to read a subvolume of the whole simulation you can do this in two ways.

Cubes

To get all particles within a cubic box of the simulation you can use the functions read_particles_in_box or read_particles_in_volume.

read_particles_in_box takes a box defined by a lower-left corner and an upper-right corner and reads all requested blocks and particles in that volume.

GadgetIO.read_particles_in_boxFunction
read_particles_in_box(filename::String, blocks::Vector{String},
                    corner_lowerleft::Array{<:Real}, corner_upperright::Array{<:Real};
                    parttype::Integer=0, verbose::Bool=true,
                    use_keys::Bool=true)

Reads all particles within a box defined by a lower left and upper right corner for a given particle type. Returns a dictionary with all requested blocks. If use_keys=true it uses Peano-Hilbert keys to constrain the read-in, otherwise it uses a brute-force read-in with a filter function. Peano-Hilbert key based read-in is significantly faster.

source
read_particles_in_box(filename::String, block::String,
                      corner_lowerleft::Array{<:Real}, corner_upperright::Array{<:Real};
                      parttype::Integer=0, verbose::Bool=true,
                      use_keys::Bool=true)

Like read_particles_in_box but for a single block. Returns the block as an array.

See also: read_particles_in_box

source

You can define an array of blocks you want to read, these will be read into a dictionary.

read_particles_in_volume is a simple wrapper around read_particles_in_box, where you can define a central position and a radius around it and it will construct the box containing that sphere for you and read all particles in it.

GadgetIO.read_particles_in_volumeFunction
read_particles_in_box(filename::String, blocks::Vector{String},
                      center_pos, radius;
                      parttype::Integer=0, verbose::Bool=true,
                      use_keys::Bool=true)

Reads all particles within a box encapsulating a volume defined by center position and radius for a given particle type. Returns a dictionary with all requested blocks.

See also: read_particles_in_box

source
read_particles_in_box(filename::String, block::String,
                      center_pos, radius;
                      parttype::Integer=0, verbose::Bool=true)

Like read_particles_in_volume but for a single block. Returns the block as an array.

See also: read_particles_in_volume

source

Arbitrary Geometries

For reading particles in more complex geometries you can use

GadgetIO.read_particles_in_geometryFunction
read_particles_in_geometry( filename::String, blocks::Vector{String},
                            geometry::AbstractGadgetGeometry;
                            parttype::Integer=0, verbose::Bool=true,
                            use_keys::Bool=true, do_shift_across_box_border::Bool=true)

Reads all particles within a space defined by an AbstractGeometry struct for a given particle type. Returns a dictionary with all requested blocks. If shift_across_box_border is true, the particles are moved beyond the borders of a periodic box, if false, the periodicity of the box is still accounted for, but the particles' positions are not shifted.

source
read_particles_in_geometry( filename::String, block::String,
                            geometry::AbstractGadgetGeometry;
                            parttype::Integer=0, verbose::Bool=true,
                            use_keys::Bool=true, do_shift_across_box_border::Bool=true)

Reads all particles within a space defined by an AbstractGeometry struct for a given particle type. Returns a dictionary with the requested block.

source

You can use built-in geometries like GadgetCube, GadgetSphere and GadgetCylinder.

If you want to extend the functionality you can define your own geometry as

struct YourGeometry{T} <: AbstractGadgetGeometry
    prop1::Vector{T}
    prop2::Vector{T}
    prop3::T
    (...)
end

and define the functions GadgetIO.get_geometry_box_corners and GadgetIO.get_geometry_mask. GadgetIO.get_geometry_box_corners has to return a Tuple of two vectors which define the lower left and upper right corner of a box that contains the geometry. GadgetIO.get_geometry_mask has to return an array of indices for which pos is contained in the geometry.

In all functions parttype defines the particle type to be read, as in the previous read functions and verbose gives console output. There are also multiple dispatch versions of all functions available that only take a single block as input and return an array with the values instead of a dictionary.

Peano-Hilbert key based reading

For large simulations Gadget distributes snapshots over multiple files. These files contain particles associated with specific Peano-Hilbert keys.

If you call read_particles_in_box or read_particles_in_volume with the keyword argument use_keys=true (which is the default case) it constructs the peano hilbert keys, selects the relevant files and reads the particles from these files into a dictionary. This is considerably faster than the brute-force attempt.

Brute-Force Reading

If you call read_particles_in_box or read_particles_in_volume with the keyword argument use_keys=false it reads all particles over all distributed files which are contained in the requested subvolume. This takes quite a lot longer than the key based reading, but sometimes it's the only option. To speed this up you can apply the filtering only once and store the Read positions.

Custom Filtering

If you want to read multiple blocks in a simulation whose snapshots have been distributed over a number of sub-snapshots you can use read_blocks_filtered.

GadgetIO.read_blocks_filteredFunction
read_blocks_filtered( snap_base::String, blocks::Array{String};
                            filter_function::Union{Function, Nothing}=nothing, 
                            read_positions::Union{Dict, Nothing}=nothing, 
                            parttype::Integer=0, verbose::Bool=true )

Reads the specified blocks from all distributed files where particles pass the filter_function, or are given by a Dict of read_positions. For read_positions please see find_read_positions.

source

This will read the specified blocks for all particles that pass the filter_function. This can be useful if you don't know where the region you are interested in is located and don't have enough memory to read in all particles. Alternatively you can also provide a Dict with read_positions as described in Read positions.

Filter functions

The filter_function can be any function that takes a String input and returns an Array of Integers, or CartesianCoordinates. For example, if you want to filter all particles with a Mach number larger than 1:

function mach_gt_1(snap_file)
    mach = read_snap(snap_file, "MACH", 0)
    sel  = findall( mach .> 1.0 )
    return sel
end

Or if you want to trick the function into reading all particles after all:

function pass_all(snap_file)
    h = read_header(snap_file)
    return collect(1:h.npart[1])
end

As an example, to read positions, velocity and ID of all shocked particles from distributed snapshots use

blocks = ["POS", "VEL", "ID"]
data = read_blocks_filtered(snap_base, blocks, filter_function=mach_gt_1, parttype=0)

Just as a reminder from above you can read single blocks into an array by using read_snap and read_block:

pos = read_block(snap_base, "POS", parttype=0)

This requiers GadgetIO.jl v0.5 though.

Read positions

To avoid having to filter all files each time you want to read a snapshot you can also split the steps. You can first filter the particles to find the positions of the particles within the data blocks with

GadgetIO.find_read_positionsFunction
find_read_positions( snap_base::String, filter_function::Function)

Finds the number of particles and their storage location in a snapshot directory that pass the filter_function.

source
find_read_positions( snap_base::String, geometry::AbstractGadgetGeometry;
                     parttype::Integer=0,
                     verbose::Bool=true )

Finds the number of particles and their storage location in a snapshot directory that are contained in the space defined by geometry.

source

and then save the result as a binary file with

To re-use the read_positions you can load them from file using load_read_positions

read_positions = load_read_positions(save_file)

This can then be used to read any number of blocks with

blocks = ["POS", "VEL", "ID"]
data = read_blocks_filtered(snap_base, blocks, read_positions=read_positions, parttype=0)

Reading particles by referenced ID

If you want to select specific particles to read from an array of IDs you can do this with

GadgetIO.read_particles_by_idFunction
read_particles_by_id(snap_base::String, ids::Array{<:Integer}, 
                     blocks::Array{String}; 
                     parttype::Integer=0, verbose::Bool=true,
                     pos0::Array{<:Real}=[-1.234, -1.234, -1.234],
                     r0::Real=0.0)

Reads particles filtered by the provided IDs. Returns all requested blocks as entries in a Dict.

source
read_particles_by_id(snap_base::String, ids::Array{<:Integer}, 
                     block::String; 
                     parttype::Integer=0, verbose::Bool=true,
                     pos0::Array{<:Real}=[-1.234, -1.234, -1.234],
                     r0::Real=0.0)

Reads particles filtered by the provided IDs. Returns the requested block as an Array.

source

snap_base defines the target snapshot, or the snapshot basename, selected_ids contains the list of IDs of the particles you want to read and blocks containes the blocknames of the blocks you want to read. If the simulation is too large to read the whole snapshot into memory you can give values for pos0 and r0 to read only a specific region with read_particles_in_volume. See Read Subvolumes for details on this.

This will return a dictionary with all requested blocks.

Example

If you want to, e.g. read positions, velocities, masses, density and hsml for all gas particles within the virial radius of the most massive halo of a simulation you can do this as follows.

Assuming pos_halo is the position of the center of mass of the halo and r_vir is its virial radius you read the data with

blocks = ["POS", "VEL", "MASS", "RHO", "HSML"]

data   = read_particles_in_volume(filename, blocks, pos_halo, r_vir,
                                  parttype=0,
                                  verbose=true)

This will return a dictionary with the blocks as keys and containing the arrays for the particles.

data["POS"]  # array of positions
data["RHO"]  # array of densities
(...)