The Impact of Cosmic Variance on Inferences of Global
Neutral Fraction Derived from Lyman-Alpha Luminosity

Functions

A THESIS

SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL

OF THE UNIVERSITY OF MINNESOTA

BY

Sean Tyler Bruton

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

FOR THE DEGREE OF

MASTER OF SCIENCE

M. Claudia Scarlata

November, 2023


© Sean Tyler Bruton 2023

ALL RIGHTS RESERVED


Abstract

We investigate the impact of field-to-field variation, deriving from cosmic variance, in

measured Lyα emitter (LAE) luminosity functions (LFs) and this variation’s impact on

inferences of the neutral fraction of the intergalactic medium (IGM) during reionization.

We post-process a z = 7 IGM simulation to populate the dark matter halos with LAEs.

These LAEs have realistic UV magnitudes, Lyα fluxes, and Lyα line profiles. We calculate

the attenuation of Lyα emission in universes with varying IGM neutral fraction, x̄HI. In a

x̄HI = 0.3 simulation, we perform 100 realizations of a mock two deg2 survey with a redshift

window ∆z = 0.5 and flux limit fLyα > 1 × 10−17erg s−1 cm−2; such a survey is typical

in depth and volume of the largest LAE surveys conducted today. For each realization,

we compute the LAE LF and use it to recover the input x̄HI. Comparing the inferred

values of x̄HI across the ensemble of the surveys, we find that cosmic variance, deriving

from large-scale structure and variation in the neutral gas along the sightline, imposes a

floor in the uncertainty of ∆x̄HI ∼ 0.2 when x̄HI = 0.3. We explore mitigation strategies

to decrease this uncertainty, such as increasing the volume, decreasing the flux limit, or

probing the volume with many independent fields. Increasing the area and/or depth of the

survey does not mitigate the uncertainty, but composing a survey with many independent

fields is effective. This finding highlights the best strategy for LAE surveys aiming at

constraining the x̄HI of the universe during reionization.

i


Contents

Abstract i

List of Tables iii

List of Figures iv

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

2 The Reionization Simulation and its Post-Processing . . . . . . . . . . . . . 4

2.1 The 21cmFAST Simulation . . . . . . . . . . . . . . . . . . . . . . . 4

2.2 Propagating Halo Growth . . . . . . . . . . . . . . . . . . . . . . . . 5

2.3 Galaxy Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

2.4 Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

2.5 Full Volume z=7 Lyα LFs . . . . . . . . . . . . . . . . . . . . . . . . 13

3 Cosmic Variance in LAEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

3.1 A Nominal LAE Survey . . . . . . . . . . . . . . . . . . . . . . . . . 15

3.2 Inference on x̄HI from the Nominal Survey . . . . . . . . . . . . . . . 17

3.3 The Effect of Cosmic Variance on x̄HI Inference . . . . . . . . . . . . 19

3.4 Mitigating the Systematics . . . . . . . . . . . . . . . . . . . . . . . 21

4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

References 31

ii


List of Tables

1 Beta Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

iii


List of Figures

1 Three z = 6 UV LFs, resulting from three different Mh-MUV scatter values,

are plotted in solid green (σMUV
= 0.1), blue (σMUV

= 0.3), and orange

(σMUV
= 0.5). We calculate the UV LF for each σMUV

value across 50

different realizations of the simulation and show the region within which 95%

of UV LFs fall with the shaded areas. The same color scheme is used for the

z=7 UV LFs, which are plotted as dotted lines. The z=6 data are plotted as

white markers with black borders; the squares are Bouwens et al. (2021) and

the circles are Ono et al. (2018). The z=7 data are shown in gray; the squares

are Finkelstein et al. (2015), stars are Atek et al. (2015), circles are Bouwens

et al. (2011), pentagons are Bouwens et al. (2015b), crosses are Bowler et al.

(2014), diamonds are Castellano et al. (2010), down triangles are Livermore

et al. (2017), up triangles are McLure et al. (2013), left pointing triangles

are Ouchi et al. (2009), right pointing triangles are Schenker et al. (2013),

and tri-down are Tilvi et al. (2013). The Mh-MUV relation with scatter of

σMUV
= 0.3 reproduces the z=6 and z=7 data simultaneously. The bright

end provides the demarcation between models with different σMUV
values. 7

2 The log 2D distribution of galaxies at z=6.6 in the MUV − EWLyα plane.

The colored lines denoted in the legend indicate traces of constant limiting

Lyα line flux in ergs s−1 cm−2. A survey with a given line flux limit could

detect LAEs which lie above the trace corresponding to its flux limit. Note

that there is an abundance of galaxies at EWLyα=0 along the very bottom

of the distribution, corresponding to the non-Lyα emitting population. . . 9

iv


3 We show the 1st moment of the Lyα lines versus the log of the Lyα luminosity

for the MUSE data at z < 3.5 (green) and z > 5.5 (red). Our linear fit is

shown with the blue line, while the black hashed lines show the 1−σ Gaussian

scatter about the relation that we fit with emcee. . . . . . . . . . . . . . . 12

4 The z = 5.7 Lyα LF is shown in blue with constraints from Konno et al.

(2018) and Ouchi et al. (2010). We see very good agreement between our

simulation and the observational constraints. . . . . . . . . . . . . . . . . . 13

5 The curves show the LAE LF after IGM attenuation for various global neu-

tral fractions, x̄HI, at z = 7. The shaded regions show the range in which

95% of 30 simulation realizations reside. Observational data from Ota et al.

(2017); Itoh et al. (2018); Hu et al. (2019) are shown with various markers.

The data have been offset by 0.01 with respect to each other on the abscissa

to aid with visualization. The observational data at z = 7 is consistent with

a completely or mostly ionized IGM, as demonstrated with the data’s agree-

ment with the blue and orange curves. This, combined with the z = 5.7 Lyα

LF agreement, shows that the evolution of the z = 5.7 to z = 7 Lyα LF is

explainable with smaller halos and lower intrinsic Lyα emission, rather than

an increase in global neutral fraction in our simulation. . . . . . . . . . . . 14

6 The black points show a LAE LF measured from a 2 square degree survey

with perfect observations to a depth of fLyα > 1× 10−17 [ergs s−1 cm−2] and

a redshift window of 6.75 < z < 7.25. The model LFs for x̄HI = [0.01, 0.3,

0.5] are shown as the blue, black, and green curves, respectively. The error

bars are assumed to be Poissonian for N > 20, otherwise they are tabulated

from Gehrels (1986). The intrinsic scatter of the black points about the 0.3

model LF, the input neutral fraction, comes from cosmic variance. . . . . . 15

v


7 The posterior on x̄HI inferred from the LAE LF measured in Figure 6 is

shown in blue. The simulation’s input x̄HI, 0.3, is indicated with the vertical

dashed black line. The posterior favors a slightly lower x̄HI than the true

value because the measured region of space happened to have a slightly over-

dense number of LAEs. The large width of posterior indicates that x̄HI is

not very well constrained; 95% of the posterior is contained in the range

0.01 < x̄HI < 0.45. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

8 The left panel shows the distribution of the number of LAEs observed, NLAE,

in 10,000 realizations of the nominal 2 square degree survey in a x̄HI=0.3 at

z = 7 universe in blue. We investigate one over-dense region (orange vertical

line) and one under-dense region (green vertical line) further. A Poisson dis-

tribution with the same sample mean is shown in black–the excess variance

over the Poisson distribution is due to the clumpy nature of the large-scale-

structure and the rarity of LAEs. The right panel shows the posteriors on

x̄HI for the extrema of the 95th percentile range of LAE densities in orange

(high LAE density) and green (low LAE density). The simulation’s input

x̄HI is shown with the vertical black dashed line. The difference between the

medians of these distributions is ∆ ∼ 0.19. . . . . . . . . . . . . . . . . . . 20

9 We increase the area of the nominal surveys on the previously considered

density extrema locations while keeping the luminosity limit and redshift

window constant. The posteriors in the low NLAE region flatten out at a

non-zero value because of a single bright LAE which happened to fall within

the survey. Increasing the area to 20 square degrees centers moves the peak

of the posterior towards the true value of x̄HI, so we will further investigate

20 square degree surveys to determine the intrinsic spread in inferred x̄HI

from cosmic variance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

vi


10 The distribution of 100 posterior medians for the 20 square degree survey

(blue) and nominal 2 square degree survey (black) for MCMC analyses. Both

of these surveys have ∆z = 0.5 and luminosity limit LLyα > 6×1042[ergss−1].

The area of the 20 square degree survey was chosen because it seemed to

remove the offset in inferred x̄HI with respect to the true value seen in Figure

9. The scatter in the posteriors medians is consistent between the two survey

strategies, indicating that even increasing the area of the survey by a factor

of 10 is not enough to overcome the fact that the uncertainty on x̄HI is

dominated by cosmic variance and imposes an uncertainty ∆x̄HI ∼ 0.2. . . 23

11 The distribution of 100 posterior medians for the deep (LLyα > 1.8 ×
1042[ergs s−1]) 10 square degree surveys (blue) and the posterior medians

of the 100 nominal surveys (black). For both survey strategies ∆z = 0.5.

The widths of the distributions of both survey strategies is about the same;

95% of the nominal survey’s medians are within a range ∆x̄HI= 0.25, while

the deep 10 square degree medians have 95% of their realizations within

∆x̄HI= 0.23. This indicates that pushing to deeper flux limits is not a viable

strategy in mitigating the effect of cosmic variance. . . . . . . . . . . . . . 25

vii


12 The left panel shows the distribution of the number of LAEs per square

degree observed by 1000 surveys of areas 2 and 20 square degrees (and to

depth LLyα ≳ 6×1042ergss−1) in blue and red, respectively. Surveys for each

area, centered on the under-dense region identified in the nominal 2 square

degree survey, are indicated with the dashed vertical lines of the same colors.

The distributions of NLAE per square degree tend to narrow as survey area

increases. The right panel shows the distribution of the NLAE observed in

1000 realizations of the 20 square degree. The underlying green distribution

shows the distribution of NLAE in 300 surveys of 100 0.2 square degree fields,

totalling 20 square degrees. This survey strategy will be discussed more in

Section 3.4; note that the distribution of NLAE is narrower for this strategy.

The dashed red vertical line again indicates the under-dense region selected

with the 2 square degree survey, and the distribution’s 2.5 percentile is

indicated with the solid black vertical line. The imprint of the under-dense

region identified with a 2 square degree survey is still dominating the 20

square degree survey’s total number of LAEs compared to other 20 square

degree surveys, illustrated by the fact that the red dashed line does not move

much towards the distribution’s mean. . . . . . . . . . . . . . . . . . . . . 27

13 The distribution of posterior medians for 90 surveys composed of 100 0.2

square degree fields to a depth of LLyα ≳ 1.8×1042ergs s−1 is shown in blue.

The distribution of the 100 nominal survey’s medians is shown in black. 95%

of the nominal survey’s medians are within a range ∆x̄HI= 0.25, while the 90

surveys composed of 100 independent fields have 95% of their realizations

within ∆x̄HI= 0.05. This strategy is effective in reducing the systematic

uncertainty on x̄HI. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

viii


1

1 Introduction

Reionization is the most recent phase change of the universe, in which the hydrogen in the

intergalactic medium (IGM) went from being virtually fully neutral to nearly completely

ionized. It is thought to have been complete within about the first billion years of the

universe. It is expected to have been a patchy process, as ionized bubbles formed and

grew around the first sources of ionizing radiation in the universe. Generally, it is thought

that stars in the first galaxies provided the bulk of the necessary ionizing radiation to drive

reionization (Finkelstein et al., 2015; Robertson et al., 2015; Bouwens et al., 2015a), though

which galaxies dominate the ionizing photon budget is still being investigated.

Over the past several years, observations have been made which put constraints on

the timeline of reionization using complementary methods sensitive to ionized and neutral

hydrogen. Some examples: Planck-Collaboration et al. (2020) use the scattering signature

of electrons freed during reionization on the cosmic microwave background to constrain

the midpoint on instantaneous reionization to redshift zre = 7.67 ± 0.73. The fraction of

dark pixels in the Lyα and Lyβ forests constrains reionization to be very nearly complete

at z ∼ 6 (Becker et al., 2021; Qin et al., 2021; Bosman et al., 2022). Quasar spectra at

z > 7 offer insight into the global neutral fraction (x̄HI) of the IGM throughout reionization

based on the IGM’s Lyα damping wing imprints (Greig et al., 2019; Davies et al., 2018).

Galaxies, and in particular Lyα-emitters (LAEs) are also used to constrain the IGM

hydrogen neutral fraction, and its evolution with time. Lyα is a resonant transition in

hydrogen and Lyα photons scatter multiple times even in small amounts of neutral gas

(Dijkstra, 2014, 2017). Thus, observations of Lyα are particularly useful to probe the

H i spatial distribution, column density and velocity fields. Observations of Lyα freely

propagating through the IGM is indicative of a largely ionized IGM or a significantly

redshifted Lyα line.

The LAE fraction, the fraction of galaxies which are found to emit Lyα, has been

found to rapidly decline at z > 6 (Stark et al., 2010; Pentericci et al., 2011; Jung et al.,

2018), and this decline is interpreted as a sign of increasing neutral hydrogen fraction

in the IGM (Treu et al., 2013; Mesinger et al., 2015; Mason et al., 2018a). There are

complications, however, in that the observed decreasing LAE fraction could also result

from evolving galaxy properties, such as the escape fraction of Lyα photons (Dijkstra,


2

2014), an increase in the number of Lyman Limit Systems (LLS) (Bolton & Haehnelt,

2013), or other considerations (see Finkelstein (2016) for a review). Still, Mesinger et al.

(2015) and Mason et al. (2018a) argue that the increasing neutral fraction of the IGM

is the most plausible explanation, and in this context the rapid disappearance of Lyα at

z > 6 paints a picture of late and rapid reionization (Ouchi et al., 2018; Hoag et al., 2019;

Mason et al., 2019; Yoshioka et al., 2022). Such a timeline for reionization may favor UV

bright, massive galaxies being the drivers of reionization (Naidu et al., 2020), as opposed

to an undetected large population of faint objects (Finkelstein et al., 2019).

Many studies have also used the evolving luminosity function of LAEs to place con-

straints on reionization (Rhoads & Malhotra, 2001; Malhotra & Rhoads, 2004, 2006; Hu

et al., 2019; Wold et al., 2021; Morales et al., 2021). However, the constraints on x̄HI at

z = 7 from LAE luminosity functions (LFs) are sometimes in slight tension. At z = 7,

Zheng et al. (2017) find x̄HI∼ 0.40−0.60, Ota et al. (2017) find x̄HI> 0.4, Itoh et al. (2018)

find x̄HI = 0.25+0.25
−0.25, Hu et al. (2019) find x̄HI ∼ 0.2 − 0.4, and Wold et al. (2021) find

x̄HI < 0.33. More recently, Morales et al. (2021) compared observed Lyα LFs with models

from an inhomogeneous reionization simulation to constrain x̄HI∼ 0.08+0.08
−0.05, 0.28 ± 0.05,

and 0.69± 0.11 at redshifts 6.6, 7.0, and 7.3, respectively.

The moderate tension between these inferences on x̄HI can perhaps be attributed to

cosmic variance. There is stochasticity in the number of LAEs observed in a given survey

arising from both large scale structure and the inhomogeneity of reionization.

It is noteworthy that LAEs at z > 6 are fairly rare objects–dedicated narrowband

surveys at z ∼ 7 typically detect tens of objects in volumes ∼ 1 × 106 cMpc (Ota et al.,

2017; Itoh et al., 2018; Hu et al., 2019; Wold et al., 2021); these small number statistics

could easily lead to field-to-field variations in the measurement of the z = 7 LAE LF. This

effect has been observed in features like the LAE LF’s “bright end bump” (Zheng et al.,

2017; Hu et al., 2019; Wold et al., 2021), the result of very bright LAEs falling within the

field of observation. The “bright end bump” is not observed in all LAE surveys at z = 7

(or even between different fields of the same survey), presumably either because of these

objects inherent rarity or their concealment behind a neutral IGM.

This is the crux of the issue we will explore in this paper: there is a degeneracy

between the stochasticity in the number of LAEs observed and the inferred global neutral


3

fraction of the universe. Indeed, Mesinger & Furlanetto (2008a) showed that, owing to

the inhomogeneity of reionization, there is intrinsic scatter in the inferred global neutral

fraction of the universe from observations of the Lyα damping wing in high-z quasars or

gamma-ray bursts (GRBs). McQuinn et al. (2008) similarly found that a single high-

z GRB could not place a constraint on the global neutral fraction with an uncertainty

better than ∆x̄HI ∼ 0.3. Mason et al. (2018b) demonstrated the same principle with

regards to Lyα equivalent widths; inference on the global neutral fraction is inherently

stochastic when using observations of the equivalent widths (EWs) of Lyα in UV bright

galaxies. Further, Mesinger & Furlanetto (2008a), McQuinn et al. (2008), and Mason

et al. (2018b) all showed that the uncertainty on x̄HI is a function of x̄HI itself, ∆x̄HI(x̄HI).

Generally, the uncertainty tends to decrease as x̄HI increases because the size distribution

of ionized bubbles gets smaller, since they have not yet had time to grow and merge in

early reionization (Mason et al., 2018b). We will demonstrate that there is a similar effect,

a floor in the uncertainty on inferring x̄HI, when using the evolving Lya LF and explore

mitigation strategies to reduce this uncertainty.

We post-process a cosmological inhomogeneous reionization simulation (Mesinger &

Furlanetto, 2007; Mesinger et al., 2011, 2016), populating the dark matter halos with

simulated galaxies from empirical relations. We first verify our simulation against the

intrinsic UV and Lyα luminosity functions (LFs) at z ∼ 6, then model the Lyα LFs after

IGM attenuation at z ∼ 7. We perform mock observations on our simulation, reproducing

the probed volumes of surveys which are carried out today. Across each of the many

realizations of these mock observational programs, we measure the LAE LF and infer x̄HI.

We investigate how the inferred z = 7 value of x̄HI changes as a result of statistical variance

in the observed LAE LFs.

In §2 we give an overview of the simulation and the post processing we have performed.

In §3 we explore the inherent uncertainty on inferred global neutral fraction at z ∼ 7

deriving from cosmic variance. We also explore if this uncertainty decreases by significantly

increasing the survey area, depth, or using many independent fields. We conclude in §4.

Throughout the paper we use the AB magnitude system and the Planck-Collaboration

et al. (2016) cosmology with (ΩΛ, Ωm, H0) = (0.69, 0.31, 68 km s−1Mpc−1),


4

2 The Reionization Simulation and its Post-Processing

In this section, we first summarize the simulation of dark matter halo masses, positions,

and their Lyα optical depths. We then present an overview of our methods for assigning

galaxy properties to the halos.

2.1 The 21cmFAST Simulation

We use nine custom simulations produced with the 21cmFASTv2 software (Mesinger &

Furlanetto, 2007; Mesinger et al., 2011, 2016). 21cmFASTv2 calculates the evolution of

the hydrogen neutral fraction in the early universe using a semi-numerical approach. Inside

the simulation cube, 1.6 cGpc on each side with a 10243 cell resolution, the code tracks

dark matter and the phase of hydrogen in each cell while accounting for recombinations,

photoheating star-formation suppression, supernova feedback, and radiation. Dark matter

halos are identified from the 30723 initial conditions, and mapped to Eulerian positions at

z=7 using perturbation theory (Mesinger & Furlanetto, 2007). More detailed information

can be found in Mesinger et al. (2016).

Each simulation contains the comoving Cartesian coordinate positions of halos with

masses ranging from 1010.25 − 1012M⊙. The halos start with only 14 discrete halo masses,

evenly spread in log space. To avoid effects resulting from this discretized mass function,

we redistribute the masses to produce a smooth distribution in log space, preserving each

halos positions while approximating the original mass. The simulation also computes Lyα

optical depths as a function of velocity offset from Lyα line center, τ(∆v), up to 525kms−1

(∆λ = 2.1 Å) in steps of 75km s−1 (∆λ ∼ 0.3 Å). These optical depths are calculated by

integrating the neutral hydrogen along a 300 comoving Mpc path; 300 cMpc was chosen

experimentally, as it ensures convergence of the optical depth (Mesinger & Furlanetto,

2008b). Intrinsically, the optical depths are smooth functions of velocity offset from the

line center, but we only have coarsely sampled optical depths at particular velocity offsets.

To mitigate any effect from this coarse sampling, we interpolate the optical depths across

velocity offsets when calculating the IGM attenuation on the Lyα line profile.

In total, we have nine z = 7 21cmFASTv2 simulations with different global neutral

fractions, x̄HI, in the range 0.01 < x̄HI < 0.92. Each simulation takes the same z = 7 dark


5

matter halo distribution and overlays on a different IGM ionization map, which corresponds

to changing the ionization efficiency (McQuinn et al., 2007; Mason et al., 2018a). To

get an intuitive picture, one may imagine that in a scenario with large x̄HI (most IGM

hydrogen neutral), any given halo will have a small ionized bubble around it, if any. In a

low x̄HI scenario (most IGM hydrogen ionized), those same halos will have larger ionized

bubbles around them, which may have begun to overlap with neighboring halos’ bubbles.

This means that each individual halo’s optical depth, at a given velocity offset, smoothly

increases as neutral fraction increases, which allows us to interpolate between the neutral

fraction files, obtaining a finer grid in neutral fraction than our original nine simulations.

2.2 Propagating Halo Growth

In order to make predictions at z = 7, we need to calibrate the Lyα LF using observations

at lower redshifts (z ∼ 6), where there is evidence that reionization is very nearly complete

(Qin et al., 2021). Though reionization may not be complete at z ∼ 6, residual x̄HI (i.e. if

x̄HI∼ 0.1) has a very small effect on the optical depth of Lyα and so the LAE LF will not

be significantly changed (Mason et al., 2018a). As redshift decreases, halos grow because

they accrete matter through mergers. Thus, we need to forward propagate the halo mass

distribution from z = 7 to lower redshifts for proper comparison to observations.

We use the median halo growth trajectories from Behroozi et al. (2019) to grow our

halos. We interpolate between the trajectories to create a fine grid in halo mass, then

assign each of the z = 7 halo masses to a particular growth curve. To assign a growth

trajectory to a halo mass, we find the curve which has z = 7 halo mass closest to the given

halo mass in the simulation. Then, moving along the matched growth curves to the lower

redshift of interest, we get the new masses. These new masses are then redistributed into

a smooth log-space distribution.

2.3 Galaxy Properties

To achieve our goal of simulating LAEs, we need to assign intrinsic Lyα luminosities and

Lyα line properties to the dark matter halos. Our road map will be:

1. Assign absolute UV magnitudes (MUV) using a Mh-MUV relation (§2.3)


6

2. Assign a Lyα EW (EWLyα) to each halo, using a UV dependent EW probability

distribution function (PDF) (§2.3)

3. Assign Lyα line profiles, with line velocity offset and width dependent on the halo

masses (§2.3)

Item 3 is required to compute the opacity to Lyα photons from the neutral IGM. In

what follows, we first describe the methodology used to assign MUV and Lyα EW to each

halo and how this methodology was validated with real data. We then discuss the new

prescription to assign the Lyα line profile and the comparison with observations.

The Mh-MUV Relation

We use the Mh-MUV relation from Mason et al. (2015) to assign UV magnitudes to our

halos. This relation has an in-built redshift dependence, since halo distributions and their

growth rates change with redshift (Mason et al., 2015). We use relations tabulated at z = 6

and z = 6.8 in this paper. Considering our small range of redshifts of interest (5.7 < z < 7)

and the relatively small evolution between the z = 6 and z = 6.8 Mh-MUV relation, we

do not derive a new Mh-MUV relation for every redshift we consider. Instead, we apply

the z = 6 relation if our simulation redshift is 5.7 < z < 6.4 and z = 6.8 relation if our

simulation redshift is 6.4 < z < 7.

The relations provided in Mason et al. (2015) do not include a prescription for the

scatter, so we follow methods akin to Ren et al. (2019) and Whitler et al. (2020) and

assume a normal scatter in MUV. The specific value of the scatter is chosen so that the

resulting UV LF reproduces the measured LFs at z ∼ 6 and z ∼ 7 simultaneously. In

Figure 1, we show UV LF observations at z=6 in black and z=7 in gray. At both redshifts,

we plot three UV LFs resulting from different scatter prescriptions: 0.1 in green, 0.3 in

blue, 0.5 in orange. Solid lines are used for z=6, and dotted lines for z=7. We apply each

relation with its scatter 50 times and indicate the range within which 95% of the UV LFs

fall with the shaded regions.

Motivated by the agreement between the UV LFs with σMUV
= 0.3 and the observations

at z = 6 and z = 7, we fix the value of σMUV
to 0.3. The fact that we are able to reproduce

UV LFs at two redshifts simultaneously with only one varying parameter is encouraging;

it validates both our halo growth prescription and the Mh-MUV relation.


7

23.0 22.5 22.0 21.5 21.0 20.5 20.0 19.5 19.0
MUV

10 7

10 6

10 5

10 4

10 3

10 2

M
[c

M
pc

3 m
ag

1 ]

MUV = 0.3
MUV = 0.5
MUV = 0.1

z=6 Simulations
z=7 Simulations

z=6 Observations
z=7 Observations

Figure 1: Three z = 6 UV LFs, resulting from three different Mh-MUV scatter values, are
plotted in solid green (σMUV

= 0.1), blue (σMUV
= 0.3), and orange (σMUV

= 0.5). We
calculate the UV LF for each σMUV

value across 50 different realizations of the simulation
and show the region within which 95% of UV LFs fall with the shaded areas. The same color
scheme is used for the z=7 UV LFs, which are plotted as dotted lines. The z=6 data are
plotted as white markers with black borders; the squares are Bouwens et al. (2021) and the
circles are Ono et al. (2018). The z=7 data are shown in gray; the squares are Finkelstein
et al. (2015), stars are Atek et al. (2015), circles are Bouwens et al. (2011), pentagons
are Bouwens et al. (2015b), crosses are Bowler et al. (2014), diamonds are Castellano
et al. (2010), down triangles are Livermore et al. (2017), up triangles are McLure et al.
(2013), left pointing triangles are Ouchi et al. (2009), right pointing triangles are Schenker
et al. (2013), and tri-down are Tilvi et al. (2013). The Mh-MUV relation with scatter of
σMUV

= 0.3 reproduces the z=6 and z=7 data simultaneously. The bright end provides
the demarcation between models with different σMUV

values.


8

MUV-EWLyα Distribution

The distribution of rest-frame EWLyα given MUV takes the form

p(EWLyα | MUV) =
A(MUV)

Wc(MUV)
e
−

EWLyα
Wc(MUV)H(EWLyα)

+ [1−A(MUV)]δ(EWLyα)

(1)

from Mason et al. (2018a). H(EWLyα) is the Heaviside step function and δ(EWLyα) is the

Dirac delta function. Mason et al. (2018a) fit the parameters A and Wc using the De Bar-

ros et al. (2017) measurements of MUV and EWLyα. The population from De Barros et al.

(2017) were photometrically selected using the Lyman break, so called Lyman break galax-

ies (LBGs), and it is worth noting that not all LBGs show Lyα emission. The parameter

A accounts for the fraction of galaxies which do not emit Lyα and Wc determines the

exponential decline of the probability distribution function towards larger EWLyα. Both

parameters are allowed to vary as a function of MUV, and we take their value from Mason

et al. (2018a). To give a sense of the resulting distribution, the log 2D distribution of the

MUV-EWLyα plane for z = 6.6 galaxies, along with traces for constant Lyα flux limits, are

shown in Figure 2. UV faint galaxies have a larger probability of taking on a large EWLyα

value than UV bright galaxies.

We are inherently assuming that the underlying EWLyα distributions are the same at

z ∼ 6, where the De Barros et al. (2017) observations are made, and at z ∼ 7, where we will

apply the EWLyα distribution. Mason et al. (2018a) argue that, while oversimplifying, this

assumption is justified by the relatively short time frame; less than 200 Myr pass between

z = 6 and z = 7. Further, this means that any evolution in the EWLyα distribution

between z = 6 and z = 7 would be attributed to an evolving x̄HI in our model, as explored

in Mason et al. (2018a).

Lyα Line Profile

Since the Lyα optical depth due to the neutral IGM depends on the wavelength offset from

the line center, we need to assign a Lyα line profile that accounts for radiative transfer

effects in the galaxies’ interstellar medium (ISM) and circumgalactic medium (CGM). We

will take a relatively simple approach in modelling the Lyα line profiles, modifying the

methods in Mason et al. (2018a).


9

Figure 2: The log 2D distribution of galaxies at z=6.6 in the MUV − EWLyα plane. The
colored lines denoted in the legend indicate traces of constant limiting Lyα line flux in
ergs s−1 cm−2. A survey with a given line flux limit could detect LAEs which lie above
the trace corresponding to its flux limit. Note that there is an abundance of galaxies
at EWLyα=0 along the very bottom of the distribution, corresponding to the non-Lyα
emitting population.


10

We assume that z ∼ 6 Lyα line profiles are statistically the same as those at z = 7

when ignoring attenuation from a neutral IGM. Under this assumption, we take observed

Lyα line properties at z ∼ 6, correlate them with observable galaxy properties, and assign

them to our z = 7 halos. The line profiles we will assign, then, have the effects of the ISM

and CGM built-in. Attenuation by the IGM could change the observed Lyα line profile

statistics at z = 7.

We assume that the line profile modified by the ISM and CGM is a Gaussian with an

offset with respect to the rest-frame line center (∆vLyα). Unlike Mason et al. (2018a), we

assign the velocity offset with respect to the galaxy rest-frame using an empirical relation-

ship from Hayes et al. (2021), which is explained further in the next subsection.

The line’s full width half maximum (FWHM) is set equal to ∆vLyα, as found in Ver-

hamme et al. (2018). The Lyα line profiles are truncated blueward of the Lyα rest-frame

line center because anything blueward of Lyα line center will redshift into resonance and

be scattered by residual neutral fraction. The Lyα line profile after leaving the ISM and

CGM is

Jα(∆vLyα,Mh, v) ∝


1√

2πσα
e
−

(v−∆vLyα)2

2σ2
α if v ≥ 0

0 otherwise.

(2)

Note, though, that roughly one quarter of ultraluminous (log(LLyα/[ergs s
−1]) > 43.5)

LAEs at z > 6 emit flux blueward of line center (Hu et al., 2016; Songaila et al., 2018;

Meyer et al., 2020). These LAEs likely reside in large ionized bubbles in the IGM. These

bubbles may have been carved out by ionizing radiation from the ultraluminous LAEs

themselves, a population of nearby ionizing sources, an obscured quaser, or a combination

of contributors (see Bagley et al. (2017) and Mason & Gronke (2020) for a discussion).

Due to their rarity, we do not modify our line profiles to account for these ultraluminous

galaxies.

Hayes et al. (2021) compiled 229 Lyα selected galaxies at 2.9 < z < 6.6 from observa-

tions with the Very Large Telescope’s (VLT) Multi Unit Spectroscopic Explorer (MUSE)

and 74 galaxies at z < 0.44 from Hubble’s Cosmic Origins Spectrograph (COS) and com-

pared the two populations’ Lyα line profiles. The VLT galaxies’ Lyα velocity offsets relative

to their systemic redshifts (i.e. the line profiles’ 1st moments) were found by calibrating the

observations with low-z data and a model (Runnholm et al., 2021). These velocity offsets,


11

∆vLyα, and the Lyα luminosities are correlated. The data are described in more detail in

Hayes et al. (2021). Note that these galaxies were selected with the Lyα line, while the

De Barros et al. (2017) sample which provided the EWLyα distribution were selected with

the Lyman break. While not all LBGs have Lyα emission, we have captured this behavior

in the EWLyα distribution; those that do have Lyα emission may be detected in the Hayes

et al. (2021) sample, and so these empirical velocity offsets from Lyα selected galaxies may

be applied to LBGs as well. Note, however, that we find that the transmission fraction of

Lyα is largely insensitive to the exact velocity offset value in our model, provided that the

line has a width and so some of the flux is away from line center. For example, if we set

the velocity offsets to 0 km/s, drastically different from the typical values from our relation

of 200-300 km/s, we see a maximum of a 6% difference in the Lyα LFs in the range from

42.4 < log10(LLyα) < 43.7 for x̄HI=0.3 (this luminosity range was chosen because of the

observational range of the Lyα LFs at z = 7, see Figure 5). For x̄HI=0.6, this percent

difference grows to 20%.

We use emcee to fit a linear relation to the observed high-z LAEs’ line profiles’ ∆vLyαand

the log of their Lyα luminosities. The best fit linear relation is given by

∆vLyα = −3156+882
−895 + 80+21

−21 log10(LLyα) (3)

with a Gaussian scatter of 13.5+6.1
−3.5 km/s. The best fit relation, with the MUSE data, is

shown in Figure 3. The z < 3.5 MUSE data are shown in orange, and the z > 5.5 data

are shown in green. Both redshift ranges are consistent with the same linear relation,

indicating little to no redshift evolution. This linear relation is used to center the Gaussian

before truncation; random Gaussian scatter of 13.5 km/s is added to the offsets.

As we will only be concerned with Lyα luminosities larger than LLyα ≳ 1042 ergs s−1,

the fact that our relation will predict a negative velocity offset at Lyα luminosities ≲

4× 1039 ergs s−1 is not an issue.

With each halo assigned aMUV, EWLyα, Lyα optical depth (τIGM), and Lyα line profile,

we have everything we need to compute the transmitted fraction of Lyα flux. Explicitly,

the fractional transmission through the IGM is:

T =

∫∞
0 Jαe

−τIGMdv∫∞
0 Jαdv

. (4)


12

Figure 3: We show the 1st moment of the Lyα lines versus the log of the Lyα luminosity
for the MUSE data at z < 3.5 (green) and z > 5.5 (red). Our linear fit is shown with the
blue line, while the black hashed lines show the 1− σ Gaussian scatter about the relation
that we fit with emcee.

2.4 Validation

With all the pieces required to simulate Lyα-emitters in place, we proceed to validate

the recipes by comparing the simulation’s z = 5.7 Lyα LF to the observations. At this

redshift, the dark pixel fraction in the Lyα and Lyβ forest indicates that reionization is

nearly complete (Qin et al., 2021), but even with a neutral fraction of x̄HI∼ 10% the impact

on Lyα transmission is negligible (Mason et al., 2018a). As such, this comparison will test

only the Mh-MUV and MUV-EWLyα relations.

The z = 5.7 Lyα LF is shown in Figure 4 along with observations from Hu et al.

(2010); Ouchi et al. (2010); Konno et al. (2018). The halos have been grown to z = 5.7 as

explained in Section 2.2 and the UV luminosities are assigned from these evolved masses.

The EWLyα are assigned via the relation described in Section 2.3. The resulting z = 5.7

Lyα LF is in very good agreement with observations, lending credit to the recipes we have

used thus far.

This work will remain focused on the impact and mitigation of cosmic variance on the

inference of x̄HI, particularly at z = 7. A reader interested in analytic modelling of the Lyα


13

Figure 4: The z = 5.7 Lyα LF is shown in blue with constraints from Konno et al. (2018)
and Ouchi et al. (2010). We see very good agreement between our simulation and the
observational constraints.

LFs at other redshifts should consult Morales et al. (2021), which uses similar methods as

in this paper.

2.5 Full Volume z=7 Lyα LFs

We used the elements discussed in the previous sections to compute the z = 7 Lyα LFs

from the full ∼ 4.1 cGpc3 simulation volume, and for all values of x̄HI ranging from 0.01

to 0.92, with spacing of 0.03. Using the entire volume allows us to minimize shot noise in

the range of x̄HI and Lyα luminosity of interest. For each value of x̄HI, we recalculate the

simulation 30 times. We take the median of the 30 realizations as the true model; this has

the effect of further minimizing shot noise and sampling the sources of stochasticity from

our empirical relations. The resulting Lyα LFs are our best estimate of the true LAE LF

at z = 7 for various x̄HI values. In Section 3.2, we will compare mock observations to these

forward models to see how well the observations constrain x̄HI.

Figure 5 shows the LAE LF calculated from the full simulation volume for various x̄HI

values. As x̄HI increases, attenuation of intrinsic Lyα emission increases and the resulting

LAE LF is suppressed; between a fully ionized universe and a universe with x̄HI ∼ 0.85,


14

Figure 5: The curves show the LAE LF after IGM attenuation for various global neutral
fractions, x̄HI, at z = 7. The shaded regions show the range in which 95% of 30 simulation
realizations reside. Observational data from Ota et al. (2017); Itoh et al. (2018); Hu et al.
(2019) are shown with various markers. The data have been offset by 0.01 with respect
to each other on the abscissa to aid with visualization. The observational data at z = 7
is consistent with a completely or mostly ionized IGM, as demonstrated with the data’s
agreement with the blue and orange curves. This, combined with the z = 5.7 Lyα LF
agreement, shows that the evolution of the z = 5.7 to z = 7 Lyα LF is explainable with
smaller halos and lower intrinsic Lyα emission, rather than an increase in global neutral
fraction in our simulation.

the suppression is about an order of magnitude. The observational data at z = 7 agree

well with a fully ionized universe, which implies that in our simulation, the evolution of

the Lyα LF from z = 5.7 to z = 7 in observations can be fully explained via halo growth

and changing intrinsic Lyα emission. Previous works, such as Dijkstra et al. (2014), have

argued that at least some of the Lyα LF evolution likely comes from evolution of galaxy

properties.

3 Cosmic Variance in LAEs

With our simulation constructed, we begin analysis to determine the importance of

cosmic variance on the inference of global neutral fraction. As laid out in the introduction,


15

Figure 6: The black points show a LAE LF measured from a 2 square degree survey with
perfect observations to a depth of fLyα > 1× 10−17 [ergs s−1 cm−2] and a redshift window
of 6.75 < z < 7.25. The model LFs for x̄HI = [0.01, 0.3, 0.5] are shown as the blue, black,
and green curves, respectively. The error bars are assumed to be Poissonian for N > 20,
otherwise they are tabulated from Gehrels (1986). The intrinsic scatter of the black points
about the 0.3 model LF, the input neutral fraction, comes from cosmic variance.

Lyα photons are easily scattered by neutral hydrogen–this has the effect of decreasing the

number of observed LAEs when neutral fraction of H i increases. This decreases the volume

density of observed LAEs. The differential change in the Lyα LF between redshifts has

been used to infer the global neutral hydrogen fraction of the universe (Malhotra & Rhoads,

2004; Ouchi et al., 2010; Kashikawa et al., 2011; Konno et al., 2014; Zheng et al., 2017;

Ota et al., 2017; Itoh et al., 2018; Konno et al., 2018; Hu et al., 2019; Wold et al., 2021;

Morales et al., 2021). Our goal is to quantify how cosmic variance impacts this inference

and explore mitigation strategies in survey design with an updated, empirical LAE model.

3.1 A Nominal LAE Survey

To gain some footing, we first consider a nominal survey, representative of volumes probed

by typical surveys carried out by ground-based telescopes equipped with narrowband filters.

We will “observe” our simulation with the specifics of the nominal survey and use Bayesian

inference to estimate x̄HI from the “observed” Lyα LF.


16

We simulate a circular survey covering 2 square degrees to a Lyα flux limit of fLyα >

1 × 10−17 ergs s−1 cm−2 (or LLyα ≳ 6 × 1042ergs s−1 at z = 7). Such a survey pushes

faint enough in flux/luminosity to detect LAEs fainter than the knee of the Lyα LF at

z=7 (Ota et al., 2017; Itoh et al., 2018; Hu et al., 2019). This depth is comparable to

those reached by narrowband surveys, and so provides an interesting baseline. Notably,

we will consider a LAE “observed” simply if it falls into the geometric cutout and exceeds

the minimum line flux limit; in this way, our survey ignores any effects of contamination

or incompleteness, and so any stochasticity on the inferred values of x̄HI between surveys

is the result of stochasticity in the number of observed LAEs alone. We impose a redshift

window of ∆z ∼ 0.5 centered at z = 7 by selecting galaxies in a slice of 160 cMpc within

the simulated cube. Such a survey has a volume of 7.58 × 106 cMpc3 at z = 7, a volume

comparable to z = 7 narrowband surveys which have larger areas but smaller redshift

window (Ota et al., 2017; Itoh et al., 2018; Hu et al., 2019; Wold et al., 2021). One should

keep in mind that the simulation is a snapshot at z = 7; there is no differential evolution

between the “front” end of our survey and the “back” end taken into account. This allows

us isolate the effect of cosmic variance by treating the global neutral fraction as a constant

over our survey window. We then have a well-defined global neutral fraction that should

be inferred in our simulation from the mock observations. Note, however, that this lack of

differential evolution is not important with regards to any specific LAE. The cumulative

optical depth is dominated by the gas nearby the galaxy, so much so that the neutral

patches with optical depth ≳ 1 are within ≲ 10 Mpc, corresponding to ∆z≈ 0.02 at z = 7

(Mesinger & Furlanetto, 2008a). This implies that the differential evolution is unimportant

when calculating the optical depth for a given LAE.

We create a z = 7 simulation with a realistic input global neutral fraction of x̄HI= 0.3

(Morales et al., 2021) and observe an area in a randomized location within the cube. This

randomized location can, in principal, have a different local neutral fraction than the input

neutral fraction, but note that we are interested in inferring the latter, rather than the

former. This survey results in a total sample of 274 LAEs. The LAE LF for one specific

realization is shown in Figure 6. We split the LAEs into bins of width ∆log(LLyα) = 0.1.

The model Lyα LFs (as measured from the entire simulation volume) for x̄HI= 0.01, 0.30

(input value), and 0.50 are shown as the blue, black, and green curves, respectively. The


17

error bars associated with the simulated LF measurements come from Poisson statistics on

the counts in each bin if NLAE > 20. For smaller numbers of LAEs, we use the estimates

tabulated in Gehrels (1986) for small numer Poisson errors.

3.2 Inference on x̄HI from the Nominal Survey

In order to speed up the inference on x̄HI, we fit a function to the model LAE LFs

from Section 2.5 as a function of x̄HI. We assume the following 2D functional form, which

reproduces the logs of the modelled Lyα LFs to 4% accuracy with a median -0.1 % difference

across all LAE LF bins and neutral fractions:

log10(Φ) = β0 + β1x + β2y + β3x
2 + β4x

2y + β5x
2y2

+β6y
2 + β7xy

2 + β8xy
(5)

where x is the log of the Lyα luminosity and y is x̄HI. The luminosity functions are in

units [cMpc−3 logL−1] and the luminosities are in units [ergs s−1]. The β coefficients are

reported below and are unitless. The fit is valid in the range 0.01 < x̄HI < 0.91 and

42 < log (LLyα/[ergs s
−1]) < 44.

Table 1: Beta Parameters
β0 β1 β2 β3 β4 β5 β6 β7 β8

-2436.6 115.7 -2563.8 -1.4 -1.4 3.3 6146.7 -286.6 120.0

Some example LAE LFs using this fit for different x̄HI are visualized in Figure 6.

We use a Bayesian approach (implemented using the MCMC fitting from the pymc3

package) to estimate the best fit hydrogen neutral fraction, x̄HI, given our model Lyα LFs

(Section 2.5) and the simulated LF measurement (Section 3.1). The 2D surface fit is used

to create the predicted LAE volume densities, µ, for a given neutral fraction. Then, the

likelihood is given by

p(ϕL | x̄HI, τ) =
∏
i

√
τ

2π
exp(−τ

2
(ϕL,i − µi)

2) (6)

where the subscript i denotes each of the measured luminosity bins. The unknown model

parameters to be fit are the precision, τ , of the observed LAE LF data and x̄HI. We consider

a Gamma function prior on τ with α = β = 1 and a flat prior on x̄HI.


18

Figure 7: The posterior on x̄HI inferred from the LAE LF measured in Figure 6 is shown in
blue. The simulation’s input x̄HI, 0.3, is indicated with the vertical dashed black line. The
posterior favors a slightly lower x̄HI than the true value because the measured region of
space happened to have a slightly over-dense number of LAEs. The large width of posterior
indicates that x̄HI is not very well constrained; 95% of the posterior is contained in the
range 0.01 < x̄HI < 0.45.

The posterior distribution function for x̄HI is then given by

p(x̄HI | ϕL) ∝
∫

p(ϕL | x̄HI, τ)p(τ)p(x̄HI)dτ (7)

where p(τ) is the Gamma prior on τ and p(x̄HI) is the uniform prior on x̄HI.

We note that we are assuming here that the precision of all of the LF measurements,

τ , is the same. The results of our paper do not change if we instead assign adaptive

luminosity bins to ensure that this assumption is true, that the precision between all the

measurements is the same, i.e. the same number of LAEs go into each bin. We use fixed

bin sizes in log(LLyα/[ergs s−1]), rather than these adaptive bins, for ease of comparison

to measurements in literature.

The posterior on x̄HI for the LAE LF shown in Figure 6 is displayed in Figure 7. The

input neutral fraction is denoted with the vertical black dashed line. We can see that the

width of the posterior is rather wide and the posterior PDF continues increasing toward

low-values of x̄HI. The posterior’s median is indicated with the vertical dashed blue line–it

underestimates the real neutral hydrogen fraction. Looking back at the LAE LF in Figure

6, the origin of this offset is apparent: more of the observed LF data points lie above the

input x̄HI than below it, and a higher number density of LAEs leads to an inference of


19

a lower x̄HI. In other words, the global neutral fraction is underestimated because the

number of LAEs in the specific realization of the nominal area is higher than the predicted

number, i.e. the randomized survey position fell on a slightly over-dense region of the

simulation (with respect to LAEs).

Motivated by this result, in the following sections we explore the range of LAE densities

expected at z = 7, quantify how this range of environments propagates to a range on the

inferred x̄HI, and then test methods to mitigate the effect of varying environments on the

inference of x̄HI.

3.3 The Effect of Cosmic Variance on x̄HI Inference

We simulate 10,000 realizations of the nominal survey scattered randomly about the sim-

ulation volume, always with a global neutral fraction x̄HI = 0.3, and consider the total

number of LAEs, NLAE, in each realization which meet the detection threshold. The dis-

tribution of NLAE is shown in blue in the left panel of Figure 8. For comparison, a Poisson

distribution with the same population mean is shown in black; two particular survey re-

alizations, one at the 2.5 and one at the 97.5 percentile of the distribution of NLAE, are

noted with green and orange vertical lines, respectively. We will look at these particular

realizations to determine a conservative (i.e. covering 95% of surveys) absolute variance in

x̄HI inferred from the different over- and under-dense environments.

We calculate the posteriors on x̄HI from the LAE LF measured from the over- and under-

dense regions and show them in the right panel of Figure 8; orange shows the posterior on

x̄HI inferred from the LAE rich region, and green shows x̄HI inferred from the LAE sparse

region. The difference between the two medians is quite large, about ∆x̄HI ∼ 0.19. This

difference is critically important, as it is impossible to know a priori if a particular survey

has landed in a region which is over-dense, under-dense, or something close to the average

density, and this has a direct effect on the inferred x̄HI. This issue is still present for studies

which use the differential evolution of the LAE LF to infer x̄HI, because we cannot know

if the comparative surveys are in over- or under-dense regions of the universe with respect

to LAEs.

This inference was done in a very ideal scenario, where there is no impurity, no incom-

pleteness, and the model LAE LFs we are comparing to our “observations” are from the


20

150 200 250 300
NLAE

0.000

0.005

0.010

0.015

0.020

0.025
Fr

eq
ue

nc
y Area = 2 sq. deg.

Low NLAE (2.5%)
High NLAE (97.5%)
Poisson with Sample Mean
Simulated Realizations

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
xHI

0.0

0.4

0.8

1.2

1.6

2.0

2.4

2.8

3.2

3.6

4.0

p(
x H

I
L L

y
)

Low NLAEs (2.5%)
High NLAEs (97.5%)
0.3, Input xHI

Figure 8: The left panel shows the distribution of the number of LAEs observed, NLAE, in
10,000 realizations of the nominal 2 square degree survey in a x̄HI=0.3 at z = 7 universe
in blue. We investigate one over-dense region (orange vertical line) and one under-dense
region (green vertical line) further. A Poisson distribution with the same sample mean
is shown in black–the excess variance over the Poisson distribution is due to the clumpy
nature of the large-scale-structure and the rarity of LAEs. The right panel shows the
posteriors on x̄HI for the extrema of the 95th percentile range of LAE densities in orange
(high LAE density) and green (low LAE density). The simulation’s input x̄HI is shown with
the vertical black dashed line. The difference between the medians of these distributions
is ∆ ∼ 0.19.

same simulation, and so we know we have the physics in the modelling precisely correct.

In reality, there would be additional uncertainty entering from the fact that comparison

models will make assumptions on the underlying physics. Despite this extremely optimistic

scenario, ∆x̄HI ∼ 0.19 at z = 7 when inferring from our nominal survey’s LAE LF. It is im-

portant to note that this uncertainty in x̄HI is a function of x̄HI itself, ∆x̄HI(x̄HI). We find,

in agreement with Mason et al. (2018b) that the uncertainty decreases as x̄HI increases,

owing to a narrower distribution of ionized bubbles sizes at high x̄HI values. We save a

full quantification of ∆x̄HI(x̄HI), akin to the work done for quasars and GRBs in Mesinger

& Furlanetto (2008a), for future work, and instead take the uncertainty at x̄HI= 0.3 as a

baseline and investigate mitigation strategies.

It is noteworthy that this nominal survey is already fairly large and pushes to moderate

flux depths. Taking into account the considered redshift window, 6.75 < z < 7.25, the

survey’s volume, 7.58×106cMpc3, is larger than many of the z = 7 LAE surveys observed to


21

date, and comparable to Wold et al. (2021). Larger volumes, deeper surveys, or independent

fields may mitigate this systematic uncertainty, so we now explore how ∆x̄HI changes when

varying the survey strategy.

3.4 Mitigating the Systematics

We consider three scenarios to mitigate systematic uncertainty on the inference of x̄HI :

increasing the survey area, increasing the survey depth, and considering a survey composed

of many independent fields.

Increasing Area

To determine survey areas which may be effective at decreasing the effect of cosmic variance

on the inference of x̄HI, we consider four increasingly large areas centered on the two

positions identified in Figure 8, i.e. an over-dense and an under-dense region. We simulate

areas of 2, 5, 10, and 20 square degrees, while keeping the survey line-flux limit fixed to

the nominal survey. For each survey, we use the same procedure described in Section 3.2

to infer the neutral hydrogen fraction. The resulting posteriors for both the over- and

under-dense regions are shown in Figure 9.

Figure 9 shows that increasing the area by a factor of 5 to 10 has a tendency to modestly

decrease the width of the posterior–in the LAE sparse region, the standard deviation of

the 2 square degree survey’s posterior is 0.15, while the 20 square degree survey’s standard

deviation is 0.13.

We take the extremely optimistic case, an observationally perfect 20 square degree

survey with ∆z = 0.5 and luminosity limit LLyα > 6 × 1042 ergs s−1 and investigate it

further.

We compute 100 realizations of this survey, randomly positioning them within the

simulated volume, and infer x̄HI from each. We calculate the median for each of these 100

posteriors and plot the distribution in Figure 10. For a baseline comparison, we repeat the

same procedure with the nominal 2 square degree survey (which has the same flux limit

as this 20 square degree survey); 100 realizations are created and each is used to make an

inference on x̄HI. The medians of the 100 nominal survey posteriors are plotted in Figure

10 as the underlying black distribution. 95% of the nominal survey’s medians are within


22

0.0 0.2 0.4 0.6 0.8
xHI

0.0

0.4

0.8

1.2

1.6

2.0

2.4

2.8

3.2

3.6

4.0

p(
x H

I
L) Low NLAE Region

Area: 2.0 sq. deg.
Area: 5.0 sq. deg.
Area: 10.0 sq. deg.
Area: 20.0 sq. deg.
0.3, Input xHI

0.0 0.2 0.4 0.6 0.8 1.0
xHI

0.0

0.4

0.8

1.2

1.6

2.0

2.4

2.8

3.2

3.6

4.0

High NLAE Region

Area: 2.0 sq. deg.
Area: 5.0 sq. deg.
Area: 10.0 sq. deg.
Area: 20.0 sq. deg.
0.3, Input xHI

Figure 9: We increase the area of the nominal surveys on the previously considered density
extrema locations while keeping the luminosity limit and redshift window constant. The
posteriors in the low NLAE region flatten out at a non-zero value because of a single bright
LAE which happened to fall within the survey. Increasing the area to 20 square degrees
centers moves the peak of the posterior towards the true value of x̄HI, so we will further
investigate 20 square degree surveys to determine the intrinsic spread in inferred x̄HI from
cosmic variance.


23

Figure 10: The distribution of 100 posterior medians for the 20 square degree survey (blue)
and nominal 2 square degree survey (black) for MCMC analyses. Both of these surveys have
∆z = 0.5 and luminosity limit LLyα > 6× 1042[ergs s−1]. The area of the 20 square degree
survey was chosen because it seemed to remove the offset in inferred x̄HI with respect to the
true value seen in Figure 9. The scatter in the posteriors medians is consistent between the
two survey strategies, indicating that even increasing the area of the survey by a factor of
10 is not enough to overcome the fact that the uncertainty on x̄HI is dominated by cosmic
variance and imposes an uncertainty ∆x̄HI ∼ 0.2.


24

a range ∆x̄HI= 0.25, while the 20 square degree medians have 95% of their realizations

within ∆x̄HI= 0.21. The width of the distribution of these medians gives us a sense of

the systematic uncertainty in x̄HI arising from cosmic variance between the two survey

strategies; there is an extremely modest tightening of posteriors for the 20 square degree

surveys compared to the 2 square degree surveys, which does not seem to be statistically

significant. It is important to emphasize that the narrowing of the posteriors in Figure 9

was for one particular survey only, whereas Figure 10 shows that the uncertainty arising

from cosmic variance is not improved by increasing the area. This is demonstrated by the

fact that the distributions of the medians of the 100 posteriors for the 2 and 20 square

degrees surveys are approximately equivalent.

Thus, even increasing our area by a factor of 10x to 20 square degrees, considering

a fairly wide redshift window, and detecting LAEs fainter than the luminosity function

knee at z=7 is not enough to significantly reduce the uncertainty on x̄HI. For surveys

larger than 20 square degrees, we begin to get to the point where even our extremely large

simulation cannot provide a large number of independent fields of view; we thus cannot

conclusively say what volume is necessary to drive the uncertainty on x̄HI below ∼ 0.2 for

a survey with flux limit fLyα > 1 × 10−17 ergs s−1 cm−2, we can only say that it is larger

than 7.58 × 107 cMpc3, a significantly larger volume than those which have been probed

in LAE surveys so far. Evidently, simply taking a survey with a larger area is not enough

to mitigate the effect of cosmic variance.

Increasing Depth

One may wonder what effect additional depth would have on the inference of x̄HI, motivated

by the fact that less luminous LAEs will generally occupy less massive halos, and so be less

biased, and thus be less sensitive to cosmic variance. To investigate this idea, we repeat the

exercise from Section 3.4, but modify the survey–we instead consider a 10 square degree

survey with a flux limit of fLyα > 3 × 10−18 ergs s−1 cm−2 (or LLyα ≳ 1.8 × 1042ergs s−1

at z = 7). This survey’s flux limit is about 3.3x deeper than the nominal survey and it is

5x larger, so it would take about 55x as long to execute. Going deeper in flux results in a

drastic increase in the number of LAEs observed due to the shape of the luminosity function;

going from a flux limit of fLyα > 1× 10−17 ergs s−1 cm−2 to fLyα > 3× 10−18 ergs s−1 cm−2


25

Figure 11: The distribution of 100 posterior medians for the deep (LLyα > 1.8 ×
1042[ergss−1]) 10 square degree surveys (blue) and the posterior medians of the 100 nominal
surveys (black). For both survey strategies ∆z = 0.5. The widths of the distributions of
both survey strategies is about the same; 95% of the nominal survey’s medians are within
a range ∆x̄HI= 0.25, while the deep 10 square degree medians have 95% of their realiza-
tions within ∆x̄HI= 0.23. This indicates that pushing to deeper flux limits is not a viable
strategy in mitigating the effect of cosmic variance.

increases the number of observed LAEs by about a factor of 10. Following the steps from

the previous section, we calculate the posteriors on x̄HI from 100 realizations of such a

survey; again, the distribution of the medians for the 100 surveys are shown in Figure

11. Again, the distribution for the 100 nominal surveys is shown in black as a baseline to

compare against.

The width which encompass 95% of the posterior medians for the deep survey is vir-

tually the same as the nominal survey, ∆x̄HI= 0.23. Thus, even pushing ∼ 3.3x deeper in

flux and increasing the area by 5x over the nominal survey, requiring ∼55x more exposure

time, has not decreased the systematic uncertainty.

Independent Fields

Lastly, we consider a survey composed of many independent fields, the motivation being

that each field will probe a different environment for the LAEs, averaging out high- and


26

low-density environments.

Further motivation can be seen examining Figure 12. The left panel shows the dis-

tributions of the number of LAEs per square degree observed by 1000 surveys of areas

2 and 20 square degrees (and to depth LLyα ≳ 6 × 1042ergs s−1). The blue dashed line

indicates a 2 square degree survey realization centered on an under-dense region (at the 2.5

percentile) and the red dashed line shows a 20 square degree survey centered on that same

position. As expected, the distributions of the number of LAEs per square degree con-

verges towards the overall simulation mean as the survey area increases; the square degree

surveys’ distribution (red) is far narrower than the two square degree surveys’ (blue).

However, the imprint of the under-dense region identified with a 2 square degree survey

is still dominating the number of LAEs observed in the 20 square degree survey when

compared to the parent distribution, as shown in the right panel. The 20 square degree

survey centered on the under-dense region (dashed red vertical line) is extremely close to

the parent distribution’s 2.5 percentile (solid black vertical line).

This is indicative of the fact that adding area to a given survey has a tendency to add

the average number of LAEs per square degree, by definition. This will have no effect on

moving a given realization towards the center of the distribution of NLAE for the parent

distribution of that survey size; only adding an under-dense or over-dense region changes

its position relative to the parent distribution. Then, a survey centered on an under-dense

(over-dense) region will continue to exhibit low (high) LAE counts relative to other similar

surveys, even as the area increases drastically.

Instead, we simulate a survey of 100 independent 0.2 square degree fields, totalling

20 square degrees, to a depth of fLyα > 3 × 10−18 ergs s−1 cm−2, the same total area as

our largest survey and as deep as our deepest simulated survey. All of these surveys are

simulated in a simulation with global neutral fraction x̄HI=0.3. Each of the 100 0.2 square

degree fields has the inference done on them separately, and the resulting 100 posteriors

are multiplied together to form a joint posterior on x̄HI.

Running a survey in this particular manner is computationally expensive, and so we

limit ourselves to 90 realizations of such a survey; we show the distribution of the posterior

medians in Figure 13. The width which encompasses 95% of the surveys has decreased–

∆x̄HI= 0.05. Probing many independent fields is the only effective way to decrease the


27

80 100 120 140 160
NLAE/sq. deg.

0.00

0.01

0.02

0.03

0.04

0.05

0.06

0.07

Fr
eq

ue
nc

y

2 sq. deg. Low Anchor
20 sq. deg. Low Anchor
2 sq. deg
20 sq. deg.

2000 2100 2200 2300 2400 2500 2600 2700 2800
NLAE

0.000

0.001

0.002

0.003

0.004

0.005

Fr
eq

ue
nc

y

20sq. deg. Low Anchor
Distribution 2.5 Percentile
20 sq. deg.
100 0.2 sq. deg. Fields

Figure 12: The left panel shows the distribution of the number of LAEs per square de-
gree observed by 1000 surveys of areas 2 and 20 square degrees (and to depth LLyα ≳
6 × 1042ergs s−1) in blue and red, respectively. Surveys for each area, centered on the
under-dense region identified in the nominal 2 square degree survey, are indicated with the
dashed vertical lines of the same colors. The distributions of NLAE per square degree tend
to narrow as survey area increases. The right panel shows the distribution of the NLAE

observed in 1000 realizations of the 20 square degree. The underlying green distribution
shows the distribution of NLAE in 300 surveys of 100 0.2 square degree fields, totalling
20 square degrees. This survey strategy will be discussed more in Section 3.4; note that
the distribution of NLAE is narrower for this strategy. The dashed red vertical line again
indicates the under-dense region selected with the 2 square degree survey, and the distri-
bution’s 2.5 percentile is indicated with the solid black vertical line. The imprint of the
under-dense region identified with a 2 square degree survey is still dominating the 20 square
degree survey’s total number of LAEs compared to other 20 square degree surveys, illus-
trated by the fact that the red dashed line does not move much towards the distribution’s
mean.


28

Figure 13: The distribution of posterior medians for 90 surveys composed of 100 0.2 square
degree fields to a depth of LLyα ≳ 1.8× 1042ergs s−1 is shown in blue. The distribution of
the 100 nominal survey’s medians is shown in black. 95% of the nominal survey’s medians
are within a range ∆x̄HI= 0.25, while the 90 surveys composed of 100 independent fields
have 95% of their realizations within ∆x̄HI= 0.05. This strategy is effective in reducing the
systematic uncertainty on x̄HI.

systematic uncertainty introduced by cosmic variance, decreasing the 95% width by about

a factor of 4. Note that the inferred global neutral fraction is biased slightly high by about

0.03. This is because the analytical fit to the model Lyα LFs is only accurate to 4%.

In practice, it would be possible to remove this bias by bypassing the step of fitting the

LFs analytically; however, for our analysis we need to compare many observing strategies

and do hundreds of instances of inference for each. The analytic function speeds up the

inference process and is the only way to complete our tests in a feasible amount of time.

This method has an additional advantage; Mesinger & Furlanetto (2008b) showed that

counts-in-cell statistics can be an effective way to use the enhanced clustering of LAEs

during reionization to constrain x̄HI. While the counts-in-cells method does not strictly

require that the fields be discontiguous, a survey strategy such as ours has the additional

benefit of overcoming any effect of cosmic variance, which could impact a clustering analysis

as well. Thus, such a survey would have two methods of constraining x̄HI, both of which are

robust against cosmic variance, which could be checked against each other for consistency.


29

The Implication of the Mitigation Techniques

We are left to conclude that the only effective way to break through the floor in the

uncertainty in x̄HI arising from the cosmic variance of LAEs is with surveys composed

of many independent fields. Drastically increasing the survey area and pushing to much

deeper fluxes alone proved ineffective. Of note, the survey composed of independent fields,

with a flux limit of fLyα > 3 × 10−18 ergs s−1 cm−2 and an area of 20 square degrees

would take ∼ 110 times as long to carry out as the nominal survey, which is currently

representative of the largest dedicated LAE surveys to date–thus, this serves more as a

proof of concept than a realistic goal for a survey design in the immediate future.

It is worth keeping in mind that an uncertainty of ∆x̄HI= 0.2, as we have demonstrated

is characteristic when the universe has x̄HI= 0.3, is still a competitive constraint. Particu-

larly, if such measurements are made at a range of redshifts, we may be able to construct a

fairly constraining timeline of reionization; thus, the LAE LF will continue to be a useful

tool in constraining reionization, in addition to other techniques, such as damping wings

in quasars and GRBs, LAE clustering, LAE fraction, and Lyα EW distributions.

4 Summary

We have post-processed a z = 7 comoving 4.1Gpc3 simulation to create a mock LAE

catalog using empirical relations. Our simulation simultaneously reproduces the z = 6 and

z = 7 UV LFs and the z = 5.7 Lyα LF.

We produce z = 7 Lyα LF models, ranging from 42 < log(LLyα/[ergs s
−1]) < 44, as a

function of x̄HI. We created mock surveys with different observation strategies and compare

the measured Lyα LFs to these models, inferring posteriors on x̄HI. The observed LAE LF

varies, deriving from the large scale structure and stochasticity in the amount of neutral

gas along the sightline between the LAEs and the observer, and the shape and position of

the peak of the posterior distribution function on x̄HI changes in turn.

We show that the precision of x̄HI estimates drawn from an LAE LF derived from a

single 2 square degree field with fLyα > 1 × 10−17 ergs s−1 cm−2 is limited by the cosmic

variance of LAEs. We find an uncertainty floor of ∆x̄HI ∼ 0.2 when x̄HI = 0.3. Note that a

2 square degree survey with a redshift window of ∆z = 0.5 has a volume of 7.58×106cMpc3,


30

about the same size as existing narrowband LAE surveys.

We investigated three methods to push this floor in the uncertainty down. (1) Increasing

the area covered by a factor of 10, (2) pushing to ∼ 3.3x deeper flux limits, and (3) breaking

up the survey into independent fields. For the added area and deeper survey strategies,

the variance in the medians of the posteriors on x̄HI remains virtually unchanged from the

nominal 2 square degree survey. This demonstrates that the x̄HI inferred from observed

LFs in contiguous fields are largely at the whim of what cosmic environment one’s survey

happened to land in.

However, a large, deep survey composed of smaller independent fields proved effective

in reducing the systematic uncertainty on x̄HI. Under this strategy, we see width of 95%

of the posterior medians of x̄HI decrease from ∆x̄HI ∼ 0.2 to ∆x̄HI ∼ 0.05. Thus, probing

multiple independent fields is critically important in constraining x̄HI.

This result demonstrates that surveys of LAEs aiming at constraining x̄HI should adopt

a strategy similar to the one proposed in Section 3.4 in order to minimize the systematic

uncertainty on x̄HI. The Parallel Application of Slitless Spectroscopy to Analyze Galaxy

Evolution (PASSAGE, PI M. Malkan) JWST survey, as a pure parallel survey, follow this

strategy, though JWST’s small field-of-view implies that the total area will be much smaller

than the survey proposed in Section 3.4. PASSAGE is being carried out during Cycle 1

of JWST observations and its observations of high-z LAES will allow the inference of new

information about the timeline of reionization. The Nancy Grace Roman Space Telescope,

with its large field of view, is a very good candidate to run a survey analogous to the one

discussed in Section 3.4, though with a higher flux limit.


References

Atek, H., Richard, J., Jauzac, M., et al. 2015, arXiv:1509.06764 [astro-ph],

arXiv:1509.06764

Bagley, M. B., Scarlata, C., Henry, A., et al. 2017, ApJ, 837, 11

Becker, G. D., D’Aloisio, A., Christenson, H. M., et al. 2021, Monthly Notices of the Royal

Astronomical Society, 508, 1853

Behroozi, P., Wechsler, R., Hearin, A., & Conroy, C. 2019, Monthly Notices of the Royal

Astronomical Society, 488, 3143

Bolton, J. S., & Haehnelt, M. G. 2013, Monthly Notices of the Royal Astronomical Society,

429, 1695

Bosman, S. E. I., Davies, F. B., Becker, G. D., et al. 2022, Hydrogen Reionisation

Ends by $z=5.3$: Lyman-$\alpha$ Optical Depth Measured by the XQR-30 Sample,

arXiv:2108.03699

Bouwens, R. J., Illingworth, G. D., Oesch, P. A., et al. 2015a, ApJ, 811, 140

—. 2011, ApJ, 737, 90

—. 2015b, ApJ, 803, 34

Bouwens, R. J., Oesch, P. A., Stefanon, M., et al. 2021, AJ, 162, 47

Bowler, R. A. A., Dunlop, J. S., McLure, R. J., et al. 2014, Monthly Notices of the Royal

Astronomical Society, 440, 2810

31


32

Castellano, M., Fontana, A., Boutsia, K., et al. 2010, A&A, 511, A20

Davies, F. B., Hennawi, J. F., Bañados, E., et al. 2018, ApJ, 864, 142

De Barros, S., Pentericci, L., Vanzella, E., et al. 2017, A&A, 608, A123

Dijkstra, M. 2014, Publ. Astron. Soc. Aust., 31, e040

—. 2017, 81

Dijkstra, M., Wyithe, S., Haiman, Z., Mesinger, A., & Pentericci, L. 2014, Monthly Notices

of the Royal Astronomical Society, 440, 3309

Finkelstein, S. L. 2016, Publ. Astron. Soc. Aust., 33, e037

Finkelstein, S. L., Ryan Jr., R. E., Papovich, C., et al. 2015, ApJ, 810, 71

Finkelstein, S. L., D’Aloisio, A., Paardekooper, J.-P., et al. 2019, ApJ, 879, 36

Gehrels, N. 1986, ApJ, 303, 336

Greig, B., Mesinger, A., & Bañados, E. 2019, Monthly Notices of the Royal Astronomical

Society, 484, 5094

Hayes, M. J., Runnholm, A., Gronke, M., & Scarlata, C. 2021, ApJ, 908, 36

Hoag, A., Bradač, M., Huang, K.-H., et al. 2019, ApJ, 878, 12

Hu, E. M., Cowie, L. L., Barger, A. J., et al. 2010, ApJ, 725, 394

Hu, E. M., Cowie, L. L., Songaila, A., et al. 2016, ApJ, 825, L7

Hu, W., Wang, J., Zheng, Z.-Y., et al. 2019, ApJ, 886, 90

Itoh, R., Ouchi, M., Zhang, H., et al. 2018, ApJ, 867, 46

Jung, I., Finkelstein, S. L., Livermore, R. C., et al. 2018, ApJ, 864, 103

Kashikawa, N., Shimasaku, K., Matsuda, Y., et al. 2011, ApJ, 734, 119

Konno, A., Ouchi, M., Ono, Y., et al. 2014, ApJ, 797, 16


33

Konno, A., Ouchi, M., Shibuya, T., et al. 2018, Publications of the Astronomical Society

of Japan, 70, arXiv:1705.01222

Livermore, R. C., Finkelstein, S. L., & Lotz, J. M. 2017, ApJ, 835, 113

Malhotra, S., & Rhoads, J. 2004, ApJ, 617, L5

—. 2006, ApJ, 647, L95

Mason, C., Trenti, M., & Treu, T. 2015, ApJ, 813, 21

Mason, C. A., & Gronke, M. 2020, Monthly Notices of the Royal Astronomical Society,

499, 1395

Mason, C. A., Treu, T., Dijkstra, M., et al. 2018a, ApJ, 856, 2

Mason, C. A., Treu, T., de Barros, S., et al. 2018b, ApJ, 857, L11

Mason, C. A., Fontana, A., Treu, T., et al. 2019, Monthly Notices of the Royal Astronomical

Society, 485, 3947

McLure, R. J., Dunlop, J. S., Bowler, R. A. A., et al. 2013, Monthly Notices of the Royal

Astronomical Society, 432, 2696

McQuinn, M., Lidz, A., Zahn, O., et al. 2007, Monthly Notices of the Royal Astronomical

Society, 377, 1043

McQuinn, M., Lidz, A., Zaldarriaga, M., Hernquist, L., & Dutta, S. 2008, Monthly Notices

of the Royal Astronomical Society, 388, 1101

Mesinger, A., Aykutalp, A., Vanzella, E., et al. 2015, Monthly Notices of the Royal Astro-

nomical Society, 446, 566

Mesinger, A., & Furlanetto, S. 2007, ApJ, 669, 663

—. 2008a, Lyman-Alpha Damping Wing Constraints on Inhomogeneous Reionization,

arXiv:0710.0371

—. 2008b, Monthly Notices RAS, 386, 1990


34

Mesinger, A., Furlanetto, S., & Cen, R. 2011, Monthly Notices of the Royal Astronomical

Society, 411, 955

Mesinger, A., Greig, B., & Sobacchi, E. 2016, Mon. Not. R. Astron. Soc., 459, 2342

Meyer, R. A., Laporte, N., Ellis, R. S., Verhamme, A., & Garel, T. 2020, Monthly Notices

of the Royal Astronomical Society, staa3216

Morales, A. M., Mason, C. A., Bruton, S., et al. 2021, ApJ, 919, 120

Naidu, R. P., Tacchella, S., Mason, C. A., et al. 2020, ApJ, 892, 109

Ono, Y., Ouchi, M., Harikane, Y., et al. 2018, Publications of the Astronomical Society of

Japan, 70, arXiv:1704.06004

Ota, K., Iye, M., Kashikawa, N., et al. 2017, ApJ, 844, 85

Ouchi, M., Mobasher, B., Shimasaku, K., et al. 2009, ApJ, 706, 1136

Ouchi, M., Shimasaku, K., Furusawa, H., et al. 2010, ApJ, 723, 869

Ouchi, M., Harikane, Y., Shibuya, T., et al. 2018, Publications of the Astronomical Society

of Japan, 70, arXiv:1704.07455

Pentericci, L., Fontana, A., Vanzella, E., et al. 2011, ApJ, 743, 132

Planck-Collaboration, Ade, P. A. R., Aghanim, N., et al. 2016, A&A, 594, A13

Planck-Collaboration, Aghanim, N., Akrami, Y., et al. 2020, A&A, 641, A6

Qin, Y., Mesinger, A., Bosman, S. E. I., & Viel, M. 2021, Monthly Notices of the Royal

Astronomical Society, 506, 2390

Ren, K., Trenti, M., & Mason, C. A. 2019, ApJ, 878, 114

Rhoads, J. E., & Malhotra, S. 2001, The Astrophysical Journal, 563, L5

Robertson, B. E., Ellis, R. S., Furlanetto, S. R., & Dunlop, J. S. 2015, ApJ, 802, L19

Runnholm, A., Gronke, M., & Hayes, M. 2021, PASP, 133, 034507


35

Schenker, M. A., Robertson, B. E., Ellis, R. S., et al. 2013, ApJ, 768, 196

Songaila, A., Hu, E. M., Barger, A. J., et al. 2018, ApJ, 859, 91

Stark, D. P., Ellis, R. S., Chiu, K., Ouchi, M., & Bunker, A. 2010, Monthly Notices of the

Royal Astronomical Society, 408, 1628

Tilvi, V., Papovich, C., Tran, K.-V. H., et al. 2013, ApJ, 768, 56

Treu, T., Schmidt, K. B., Trenti, M., Bradley, L. D., & Stiavelli, M. 2013, ApJ, 775, L29

Verhamme, A., Garel, T., Ventou, E., et al. 2018, Monthly Notices of the Royal Astronom-

ical Society: Letters, 478, L60

Whitler, L. R., Mason, C. A., Ren, K., et al. 2020, Monthly Notices of the Royal Astro-

nomical Society, 495, 3602

Wold, I. G. B., Malhotra, S., Rhoads, J., et al. 2021, arXiv:2105.12191 [astro-ph],

arXiv:2105.12191

Yoshioka, T., Kashikawa, N., Inoue, A. K., et al. 2022, arXiv:2201.07261 [astro-ph],

arXiv:2201.07261

Zheng, Z.-Y., Wang, J., Rhoads, J., et al. 2017, ApJ, 842, L22


	Abstract
	List of Tables
	List of Figures
	Introduction
	The Reionization Simulation and its Post-Processing
	The 21cmFAST Simulation
	Propagating Halo Growth
	Galaxy Properties
	Validation
	Full Volume z=7 Ly LFs

	Cosmic Variance in LAEs
	A Nominal LAE Survey
	Inference on HI from the Nominal Survey
	The Effect of Cosmic Variance on HI Inference
	Mitigating the Systematics

	Summary

	References