--- jupytext: text_representation: extension: .md format_name: myst format_version: 0.13 jupytext_version: 1.18.1 kernelspec: display_name: venv language: python name: python3 --- # Fitting a connective field model to fMRI data from a visual experiment **Author**: Malte Lüken (m.luken@esciencecenter.nl) This example shows how to fit a connective field (CF) model to blood oxygenation level-dependent (BOLD) functional magnetic resonance imaging (fMRI) data. A CF model maps neural activity in one brain region of interest (ROI; e.g., V2 in the human visual cortex) to activity in another brain region (e.g., V1). The CF is the location in the source region that predicts neural activity in the target region. In this example, we select V1 in the visual cortex as the source region. We do not specify a target region but instead map the entire brain to V1. Because prfmodel uses Keras for model fitting, we need to make sure that a backend is installed before we begin. In this example, we use the TensorFlow backend. ```{code-cell} ipython3 import os from importlib.util import find_spec # Set keras backend to 'tensorflow' (this is normally the default) os.environ["KERAS_BACKEND"] = "tensorflow" # Hide tensorflow info messages os.environ["TF_CPP_MIN_LOG_LEVEL"] = "1" if find_spec("tensorflow") is None: msg = "Could not find the tensorflow package. Please install tensorflow with 'pip install .[tensorflow]'" raise ImportError(msg) ``` ## Loading the surface Before we start modelling, we need to make sure that we have all the requirements to visualize the BOLD response data. To visualize fMRI data, we can project it onto an anatomical surface of the brain. In this tutorial, we use a template surface based on the surface of the Human Connectome Project (see https://doi.org/10.6084/m9.figshare.13372958). For visualization, we use the nilearn package. We download the surface and load the flat meshes for both hemispheres. ```{code-cell} ipython3 from prfmodel.examples import load_surface_mesh # Download surface mesh and load as nilearn.surface.PolyMesh object mesh = load_surface_mesh(dest_dir="data", surface_type="flat") ``` ## Loading the BOLD response Now that we have the flat surface, we load the raw BOLD response data from a single subject that can be represented on the surface mesh. We load data from both hemispheres of the subject by setting `hemisphere="both"`. ```{code-cell} ipython3 %matplotlib inline import matplotlib.pyplot as plt from prfmodel.examples import load_single_subject_fmri_data response_raw = load_single_subject_fmri_data(dest_dir="data", hemisphere="both") response_raw.shape # shape (num_vertices, num_frames) ``` The BOLD response time courses have 120 time frames. We convert the raw BOLD response signal to percent signal change (PSC). ```{code-cell} ipython3 response_psc = ((response_raw.T / response_raw.mean(axis=1)).T - 1.0) * 100.0 ``` The `response_psc` object contains the BOLD response timecourse for each voxel on the surface mesh. It has shape `(num_vertices, num_frames)` where `num_vertices` is the number of vertices on the surface mesh and `num_frames` the number of time frames of the recording. The timecourses from the left and right hemispheres are concatenated (left is first). Because the data has been converted to PSC, each timecourse should have a mean of zero. First, we define a helper function to plot a statistic on the surface. ```{code-cell} ipython3 import numpy as np from nilearn.plotting import plot_surf_stat_map SURF_VIEW = (90, 270) def plot_surf_stat_map_helper( stat_map: np.ndarray, vmin: float | None = None, vmax: float | None = None, title: str | None = None, ) -> tuple[plt.Figure, plt.Axes]: """Helper function to plot a surface with a stat map.""" fig, ax = plt.subplots(subplot_kw={"projection": "3d"}, figsize=(8, 6)) plot_surf_stat_map( mesh, stat_map, vmin=vmin, vmax=vmax, hemi="both", view=SURF_VIEW, cmap="inferno", axes=ax, figure=fig, title=title, ) # Expand the 3D axes to fill the space to the left of the colorbar surf_axes = [a for a in fig.axes if a.name == "3d"] cbar_axes = [a for a in fig.axes if a.name != "3d"] cbar_x0 = min(a.get_position().x0 for a in cbar_axes) for a in surf_axes: a.set_position([0.0, 0.0, cbar_x0 - 0.01, 0.97]) return fig, ax ``` We can then plot the standard deviation of each timecourse on the surface. ```{code-cell} ipython3 # Calculate standard deviation of each timecourse response_sd = response_psc.std(axis=1) plot_surf_stat_map_helper(response_sd, vmax=5.0, title="Response standard deviation"); ``` We can see that there is high variation in the signal in the visual areas (e.g., V1-V3) as we would expect for a visual stimulus. +++ ## Inspecting the data Before we model the BOLD response, we look at the timecourses and inspect the quality of the data. We start by looking at the timecourses of our source region V1. We select the vertices that have been assigned to V1 through a multi-modal parcellation (MMP) atlas based on Glasser et al. (2016). ```{code-cell} ipython3 import plotly.io as pio import plotly.express as px pio.renderers.default = "notebook_connected" # Requires internet connection to work pio.templates.default = "simple_white" # The atlas comes with the surface mesh atlas = np.load("data/hcp_999999/surface-info/mmp_atlas.npz") # V1 has label 1 label_v1 = 1 # Concatenate labels from both hemispheres (left is first) is_v1 = np.concatenate([ atlas["left"] == label_v1, atlas["right"] == label_v1, ]) ``` We then plot the BOLD response for each vertex in V1 over time. Note that the unit of time frames is repetition time (TR) which is 1.5 seconds for this dataset. ```{code-cell} ipython3 response_psc_v1 = response_psc[is_v1] fig = px.line( response_psc_v1[::10].T, animation_frame="variable", range_x=(0, 120), range_y=(-5, 5), labels={ "index": "Time frame (in TR)", "value": "BOLD response (in PSC)", "variable": "V1 vertex", }, title="V1 vertex timecourses", ) fig.update_layout(showlegend=False, height=450) fig.show() ``` While the timecourses are quite noisy, we can see that, for some vertices, there are four regular peaks in the signal. These peaks originate from the visual stimulus that was shown during the fMRI recording. The visual stimulus contained four corresponding bars moving through the visual field in different directions (see [](prf_2d_fmri_visual.md)). The goal of our CF model is to map these responses to the responses from other vertices in the brain. We can get an overview of all vertices by plotting all timecourses at once in a heatmap. ```{code-cell} ipython3 aspect_ratio = response_psc.shape[1] / response_psc.shape[0] fig, ax = plt.subplots(1, 1, figsize=(8, 6)) im = ax.imshow( response_psc, aspect=aspect_ratio, cmap="inferno", vmin=-2, vmax=5, ) ax.set_xlabel("Time frame (in TR)") ax.set_ylabel("Vertex index") fig.colorbar(im, ax=ax, label="BOLD response (in PSC)"); ``` ## Creating the Distance Matrix The input for CF models is the response in the source region (which we have already identified) and a distance matrix between vertices in that region. To compute this distance matrix we define a helper function that loads a white matter surface from a GIFTI file, transforms the surface into a sparse graph, and applies Dijkstra's algorithm to find the weighted shortest paths between all vertices in our source region. > **Note - distance matrix construction:** There are alternative ways of constructing a distance matrix on the cortical > surface. For example, the [Heat method](https://www.cs.cmu.edu/~kmcrane/Projects/HeatMethod/) computes the geodesic distance between vertices on the surface. ```{code-cell} ipython3 from collections.abc import Sequence import nibabel as nib from scipy.sparse import csr_matrix from scipy.sparse.csgraph import dijkstra def calculate_distance_matrix_dijkstra(surface_path: str, source_indices: Sequence) -> np.ndarray: """Calculate the distance matrix for a sequence of source region vertex indices from a surface file.""" img = nib.load(surface_path) coords, triangles = img.agg_data() i, j, k = triangles.T edges = np.vstack((np.column_stack((i, j)), np.column_stack((j, k)), np.column_stack((k, i)))) v_start = coords[edges[:, 0]] v_end = coords[edges[:, 1]] dist = np.linalg.norm(v_start - v_end, axis=1) n_verts = coords.shape[0] graph = csr_matrix((dist, (edges[:, 0], edges[:, 1])), shape=(n_verts, n_verts)) vert_is_source = np.isin(np.arange(n_verts), source_indices) graph_source = graph[:, vert_is_source] graph_source = graph_source[vert_is_source, :] dist_matrix = dijkstra(csgraph=graph_source, directed=False) return dist_matrix ``` We identify the indices of vertices in our source region V1 and compute the distance matrices for each hemisphere separately to exclude connections between the hemispheres. We combine both matrices and set inter-hemisphere distances to infinity indicating that connective fields cannot map across hemispheres. ```{code-cell} ipython3 idx_v1_left = np.where(atlas["left"] == label_v1)[0] idx_v1_right = np.where(atlas["right"] == label_v1)[0] dist_matrix_lh = calculate_distance_matrix_dijkstra("data/hcp_999999/surfaces/wm_lh.gii", idx_v1_left) dist_matrix_rh = calculate_distance_matrix_dijkstra("data/hcp_999999/surfaces/wm_rh.gii", idx_v1_right) # We pad both matrices with infinity values and then concatenate dist_matrix = np.concatenate([ np.pad(dist_matrix_lh, [(0, 0), (0, dist_matrix_rh.shape[1])], constant_values=np.inf), np.pad(dist_matrix_rh, [(0, 0), (dist_matrix_lh.shape[1], 0)], constant_values=np.inf), ]) dist_matrix.shape ``` We can plot the final distance matrix. ```{code-cell} ipython3 fig, ax = plt.subplots(1, 1, figsize=(8, 6)) im = ax.imshow( dist_matrix, cmap="inferno", ) ax.set_xlabel("Vertex index (V1)") ax.set_ylabel("Vertex index (V1)") fig.colorbar(im, ax=ax, label="Distance (in mm)"); ``` We can see that the distances vary across vertices according to a regular pattern. This is due to the regular position of vertices on the surface mesh. We combine the distance matrix and the response in the source region in a {py:class}`prfmodel.stimuli.CFStimulus` object that we use as input for our CF model. ```{code-cell} ipython3 from prfmodel.stimuli import CFStimulus stimulus = CFStimulus(distance_matrix=dist_matrix, source_response=response_psc_v1) print(stimulus) ``` ## Defining the CF model Now that we have our BOLD response data and stimulus in place, we can create a CF model to *predict* a response to this stimulus. We use the standard Gaussian CF model that is described in Haak et al. (2013): It assumes that the stimulus (a vertex in the distance matrix) elicits a response that follows a Gaussian shape in two-dimensional cortical space. A baseline and amplitude parameter shift and scale the predicted response to match the observed BOLD response. The {py:class}`prfmodel.models.gaussian.GaussianCFModel` class performs these steps to make a combined prediction. ```{code-cell} ipython3 import pandas as pd from prfmodel.models.gaussian import GaussianCFModel cf_model = GaussianCFModel() ``` The {py:class}`prfmodel.models.gaussian.GaussianCFModel` class requires two core parameters: `center_index` indicates the location of the CF in the source region. With the current CF implementation in prfmodel, this must be a vertex index in the distance matrix/source region. This means that the center of a CF must be a vertex on the surface mesh (i.e., it cannot be in between vertices). We generate predictions for a range of `center_index` values to visualize how the predicted response changes across different CF locations. ```{code-cell} ipython3 # Generate predictions for a subset of center_index values preview_center_indices = np.arange(0, response_psc_v1.shape[0], 10) preview_params = pd.DataFrame({ "center_index": preview_center_indices.astype(float), "sigma": 1.0, "baseline": 0.0, "amplitude": 1.0, }) preview_predictions = np.asarray(cf_model(stimulus, preview_params)) fig = px.line( preview_predictions.T, animation_frame="variable", range_x=(0, 120), range_y=(-2, 2), labels={ "index": "Time frame (in TR)", "value": "BOLD response (in PSC)", "variable": "Center index", }, title="Predicted CF response for different center indices", ) fig.update_layout(showlegend=False, height=450) fig.show() ``` ## Fitting the CF model Before we start fitting the CF model to the BOLD timecourse, we need to filter out vertices with missing data, that is, vertices containing `NaN` values. We filter out all vertices that have missing BOLD response measurements. ```{code-cell} ipython3 response_is_valid = np.all(np.isfinite(response_psc), axis=1) response_valid = response_psc[response_is_valid] is_v1_valid = is_v1[response_is_valid] ``` We will fit the CF model to our BOLD response data using two stages. We begin with a grid search using {py:class}`prfmodel.fitters.grid.GridFitter` to find good values for our parameters of interest (`center_index`, and `sigma`). Then, we use {py:class}`prfmodel.fitters.linear.LeastSquaresFitter` to estimate the `baseline` and `amplitude` of our model. > **Note**: Typically, we would also use stochastic gradient descent (SGD) to finetune our model fits after the least-squares stage. To keep the example computationally simple, we omit this step here but encourage readers to explore this approach at the end. > Importantly, we can only optimize continuous parameters with SGD, so `center_index` must be a fixed value. +++ Let's start with the grid search by defining ranges of `center_index` and `sigma` that we want to construct a grid of parameter values from. For `baseline` and `amplitude`, we only provide a single value so that they will stay constant across the entire grid. ```{code-cell} ipython3 param_ranges = { "center_index": np.arange(dist_matrix.shape[0]), # We test every vertex in the source region "sigma": np.linspace(0.005, 20.0, 10), "baseline": [0.0], "amplitude": [1.0], } ``` The grid search will evaluate all possible combinations of the defined ranges and return the combination that fits the simulated data best. Note that this is still a relatively small grid and we recommend specifying finer grids in practice. Let's construct the {py:class}`prfmodel.fitters.grid.GridFitter` and perform the grid search. Note that we set `batch_size=20` to let the {py:class}`prfmodel.fitters.grid.GridFitter` evaluate 20 parameter combinations at the same time (which saves us some memory). As the `loss` (i.e., the metric to minimize between model predictions and data), we use cosine similarity which ignores differences in scale between model predictions and observed data. ```{code-cell} ipython3 from keras.losses import CosineSimilarity from prfmodel.fitters.grid import GridFitter # Create grid fitter object grid_fitter = GridFitter( model=cf_model, stimulus=stimulus, loss=CosineSimilarity(reduction="none"), # Grid fitter needs no aggregation of loss ) # Run grid search grid_history, grid_params = grid_fitter.fit( data=response_valid, parameter_values=param_ranges, batch_size=20, ) ``` ```{code-cell} ipython3 grid_params ``` We can see that the estimates for `center_index` and `sigma` are one combination in our grid. In the second step, we also optimize the `amplitude` of the pRF model together with the `baseline` using least squares. This adjusts the scale of our model predictions to scale the observed data. ```{code-cell} ipython3 from prfmodel.fitters.linear import LeastSquaresFitter # Create least-squares fitter ls_fitter = LeastSquaresFitter( model=cf_model, stimulus=stimulus, ) # Run least squares fit ls_history, ls_params = ls_fitter.fit( data=response_valid, parameters=grid_params, slope_name="amplitude", # Names of parameters to be optimized with least squares intercept_name="baseline", batch_size=200, ) ``` ```{code-cell} ipython3 ls_params ``` We can see that the amplitudes are substantially lower compared to the starting value (and our initial guess) for many vertices. Now that the core CF parameters and the auxiliary baseline and amplitude parameters are optimized, we can compare the model predictions against the observed responses. Because we want to make predictions for all vertices in the brain, we wrap our `cf_model` in the {py:func}`prfmodel.utils.batched` modifier function. The modifier changes the behavior of the model to make predictions for batches of vertices sequentially. This saves us a lot of memory at the expense of minimal runtime overhead. ```{code-cell} ipython3 from prfmodel.utils import batched predict_batched = batched(cf_model) # Make predictions with optimized parameters pred_response = np.asarray(predict_batched(stimulus, ls_params, batch_size=200)) ``` We can quantify how well the predictions align with the observed timecourses using the R-squared metric. This metric indicates the proportion of variance in the observed data explained by our model predictions. ```{code-cell} ipython3 from keras.metrics import R2Score r2_metric = R2Score(class_aggregation=None) # Don't aggregate score over vertices r_squared = np.asarray(r2_metric(response_valid.T, pred_response.T)) # Transpose to compute score across time frames r_squared.shape ``` We can look at the distribution of R-squared values across vertices. ```{code-cell} ipython3 fig = px.histogram( x=r_squared, nbins=15, labels={"x": "R-squared"}, ).update_layout(yaxis_title="Frequency", height=450) fig.show() ``` We can see that many vertices have a score above zero meaning that the response of the CF model matches the observed response. A substantial amount of vertices also have higher scores, suggesting that the CF model successfully maps different brain regions to the source region V1. Note that some vertices have a perfect score of $\approx 1$. These are the vertices in V1 which are perfectly mapped onto themselves. Let's plot the predicted and the observed timecourses for a subsample of vertices. ```{code-cell} ipython3 import plotly.graph_objects as go each_k = 500 # Sort vertices that are not in V1 according to R-squared best_vertices = np.flip(np.argsort(r_squared[~is_v1_valid]))[::each_k] response_valid_best_vertices = response_valid[~is_v1_valid][best_vertices] pred_response_best_vertices = pred_response[~is_v1_valid][best_vertices] r_squared_valid_best_vertices = r_squared[~is_v1_valid][best_vertices] sigma_best_vertices = ls_params["sigma"].values[~is_v1_valid][best_vertices] df_valid = pd.DataFrame(response_valid_best_vertices.T) df_pred = pd.DataFrame(pred_response_best_vertices.T) df_valid["source"] = "Observed" df_pred["source"] = "Predicted" df = pd.concat([df_valid, df_pred], axis=0) df["time"] = np.tile(np.arange(df_valid.shape[0]), 2) df_melted = df.melt(id_vars=["source", "time"], var_name="vertex", value_name="response") fig = px.line( df_melted, x="time", y="response", color="source", animation_frame="vertex", range_y=[-5, 5], labels={ "time": "Time frame (in TR)", "response": "BOLD response (in PSC)", "vertex": "Vertex", "source": "", }, title="Observed and predicted vertex responses (subsampled vertices)", ) # Add a text trace to display per-vertex stats; this will be updated in each animation frame fig.add_trace(go.Scatter( x=[60], y=[4.7], mode="text", text=[f"R-squared = {r_squared_valid_best_vertices[0]:.3f}, sigma = {sigma_best_vertices[0]:.3f}"], showlegend=False, hoverinfo="skip", textfont=dict(size=13), )) stats_trace_idx = len(fig.data) - 1 # Append a stats text update to each animation frame's trace data for i, frame in enumerate(fig.frames): r2 = r_squared_valid_best_vertices[i] sigma = sigma_best_vertices[i] frame.data = list(frame.data) + [go.Scatter( text=[f"R-squared = {r2:.3f}, sigma = {sigma:.3f}"], )] fig.update_layout(showlegend=True, height=450) fig.show() ``` We can see that the model predicts the observed timecourses for these vertices very well. We can also visualize the R-squared score on the flat surface mesh. ```{code-cell} ipython3 r_squared_full = np.full((response_psc.shape[0],), fill_value=np.nan) r_squared_full[response_is_valid] = r_squared plot_surf_stat_map_helper(r_squared_full, vmin=0.0, vmax=1.0, title="Variance explained (R-squared)"); ``` We can see that the vertices with the highest scores are located in the visual areas of both hemispheres. This means that there is high connectivity between higher areas in the visual pathway and V1. ## Analyzing the CF results To analyze and interpret the CF parameters, we will zoom in on vertices with R-squared > 0.6 which is roughly 10% of the vertices in the brain (note that this threshold is somewhat arbitrary). ```{code-cell} ipython3 # Create mask for vertices above R-squared threshold is_above_threshold = r_squared > 0.6 # Compute proportion of vertices above threshold is_above_threshold.mean() ``` We look at the CF size (in mm) indicated by `sigma` and plot it on the surface. ```{code-cell} ipython3 # Allocate size vector for the entire surface size = np.full((response_psc.shape[0],), fill_value=np.nan) # Fill valid vertices with estimated size parameters size[response_is_valid] = np.where(is_above_threshold, ls_params["sigma"], np.nan) plot_surf_stat_map_helper(size, vmin=0.0, vmax=20.0, title="CF size (sigma, in mm)"); ``` We can see that vertices in higher areas in the visual pathway have larger CF sizes. +++ ## Conclusion This example showed how to fit a Gaussian CF model to empirical fMRI data collected from an experiment in the visual domain. First, we plotted the raw BOLD response data on the cortical surface. Second, we calculated a distance matrix from a white matter surface using Dijkstra's algorithm. Then, we defined a CF model and optimized its parameters using a grid search followed by least-squares to adjust for baseline and amplitude differences. Finally, we visualized model fit, the estimated parameters, and derived measures on the cortical surface. ## Next steps The predictions by our CF model can potentially be improved. We suggest different directions for improving the pRF model fit: - Increasing the number of points in the parameter grid for the grid search - Finetuning `sigma` with stochastic gradient descent with {py:class}`prfmodel.fitters.sgd.SGDFitter` - Applying preprocessing steps before fitting the pRF model (e.g., high-pass filtering) - Building a more complex CF model (e.g., difference of Gaussian, see Zuiderbaan et al., 2012) +++ ## Stay Tuned More tutorials on fitting models to empirical data and creating custom models are in the making. For questions and issues, please make an issue on [GitHub](https://github.com/popylar-org/prfmodel/issues) or contact Malte Lüken (m.luken@esciencecenter.nl). +++ ## References Glasser, M. F., Coalson, T. S., Robinson, E. C., Hacker, C. D., Harwell, J., Yacoub, E., Ugurbil, K., Andersson, J., Beckmann, C. F., Jenkinson, M., Smith, S. M., & Van Essen, D. C. (2016). A multi-modal parcellation of human cerebral cortex. *Nature, 536*(7615), 171–178. https://doi.org/10.1038/nature18933 Haak, K. V., Winawer, J., Harvey, B. M., Renken, R., Dumoulin, S. O., Wandell, B. A., & Cornelissen, F. W. (2013). Connective field modeling. *NeuroImage, 66*, 376–384. https://doi.org/10.1016/j.neuroimage.2012.10.037 Van Essen, D. C., Ugurbil, K., Auerbach, E., Barch, D., Behrens, T. E. J., Bucholz, R., Chang, A., Chen, L., Corbetta, M., Curtiss, S. W., Della Penna, S., Feinberg, D., Glasser, M. F., Harel, N., Heath, A. C., Larson-Prior, L., Marcus, D., Michalareas, G., Moeller, S., … WU-Minn HCP Consortium. (2012). The Human Connectome Project: A data acquisition perspective. *NeuroImage, 62*(4), 2222–2231. https://doi.org/10.1016/j.neuroimage.2012.02.018 Zuiderbaan, W., Harvey, B. M., & Dumoulin, S. O. (2012). Modeling center–surround configurations in population receptive fields using fMRI. *Journal of Vision, 12*(3), 10. https://doi.org/10.1167/12.3.10