# Introduction to medical image processing with Python: CT lung and vessel segmentation without labels (code included)

Time for some hands-on tutorial on medical imaging. However, this time we will not use crazy AI but basic image processing algorithms. The goal is to familiarize the reader with concepts around medical imaging and specifically Computed Tomography (CT).

It is critical to understand how far one can go without deep learning, to understand when it’s best to use it. New practitioners tend to ignore that part, but medical image analysis is still 3D image processing.

I also include parts of the code to facilitate the understanding of my thought process.

The accompanying Google colab notebook can be found here to run the code shown in this tutorial. A Github repository is also available. Star our repo if you liked it!

To dive deeper into how AI is used in Medicine, you can’t go wrong with the AI for Medicine online course, offered by Coursera.

We will start with the very basics of CT imaging. You may skip this section if you are already familiar with CT imaging.

## CT imaging

### Physics of CT Scans

Computed Tomography (CT) uses X-ray beams to obtain 3D pixel intensities of the human body. A heated cathode releases high-energy beams (electrons), which in turn release their energy as X-ray radiation. X-rays pass through human body tissues and hits a detector on the other side. A dense tissue (i.e. bones) will absorb more radiation than soft tissues (i.e. fat). When X-rays are not absorbed from the body (i.e. in the air region inside the lungs) and reach the detector we see them as black, similar to a black film. On the opposite, dense tissues are depicted as white.

In this way, CT imaging is able to distinguish density differences and create a 3D image of the body.

Here is a 1 min video I found very concise:

### CT intensities and Hounsfield units

The X-ray absorption is measured in the Hounsfield scale. In this scale, we fix the Air intensity to -1000 and water to 0 intensity. It is essential to understand that Housenfield is an absolute scale, unlike MRI where we have a relative scale from 0 to 255.

The image illustrates some of the basic tissues and their corresponding intensity values. Keep in mind that the images are noisy. The numbers may slightly vary in real images.

The Hounsfield scale. Image by Author.

Bones have high intensity. We usually clip the image to have an upper maximum range. For instance, the max value might be 1000, for practical reasons.

The problem: visualization libraries work on the scale [0,255]. It wouldn’t be very wise to visualize all the Hounsfield scale (from -1000 to 1000+ ) to 256 scales for medical diagnosis.

Instead, we limit our attention to different parts of this range and focus on the underlying tissues.

### CT data visualization: level and window

The medical image convention to clip the Housenfield range is by choosing a central intensity, called level and a window, as depicted:

It is actually quite an ugly convention for computer scientists. We would just like the min and max of the range:

$max = level + window/2$ $min = level - window/2$

import matplotlib.pyplot as plt
import numpy as np
def show_slice_window(slice, level, window):
"""
Function to display an image slice
Input is a numpy 2D array
"""
max = level + window/2
min = level - window/2
slice = slice.clip(min,max)
plt.figure()
plt.imshow(slice.T, cmap="gray", origin="lower")
plt.savefig('L'+str(level)+'W'+str(window))



If you are not convinced, the next image will convince you that the same CT image is more rich than a common image channel:

Window and leveling in CT can give you quite different images. Image by Author.

For reference here is a list of visualization ranges:

 Region/Tissue Window Level brain 80 40 lungs 1500 -600 liver 150 30 Soft tissues 250 50 bone 1800 400

Time to play!

## Lung segmentation based on intensity values

We will not just segment the lungs but we will also find the real area in $$mm^2$$. To do that we need to find the real size of the pixel dimensions. Each image may have a different one (pixdim in the nifty header file). Let’s see the header file first:

import nibabel as nib


Here I will show only some important fields of the header:

<class 'nibabel.nifti1.Nifti1Header'> object, endian='<'
sizeof_hdr      : 348
dim             : [  2 512 512   1   1   1   1   1]
datatype        : int16
bitpix          : 16
pixdim          : [1.   0.78515625 0.78515625   1.   1.   1.    1.   1.  ]
srow_x          : [ -0.78515625   0.           0.         206.60742   ]
srow_y          : [  0.          -0.78515625   0.         405.60742   ]
srow_z          : [   0.     0.     1.  -304.5]


For the record, srow_x, srow_y, srow_z is the affine matrix of the image. Bitpix is how many bits we use to represent each pixel intensity.

So let’s define a function that reads that information from the header file. Based on the nifty format each dimension in the nifty file has a pixel dimension. What we need is to find out the 2 dimension indices of the image and their respective pix dimensions.

### Step 1: Find pixel dimensions to calculate the area in mm^2

def find_pix_dim(ct_img):
"""
Get the pixdim of the CT image.
A general solution that gets the pixdim indicated from the image dimensions. From the last 2 image dimensions, we get their pixel dimension.
Args:
ct_img: nib image

Returns: List of the 2 pixel dimensions
"""
pix_dim = ct_img.header["pixdim"] # example [1,2,1.5,1,1]
dim = ct_img.header["dim"] # example [1,512,512,1,1]
max_indx = np.argmax(dim)
pixdimX = pix_dim[max_indx]
dim = np.delete(dim, max_indx)
pix_dim = np.delete(pix_dim, max_indx)
max_indy = np.argmax(dim)
pixdimY = pix_dim[max_indy]
return [pixdimX, pixdimY] # example [2, 1.5]


### Step 2: Binarize image using intensity thresholding

We expect lungs to be in the Housendfield unit range of [-1000,-300]. To this end, we need to clip the image range to [-1000,-300] and binarize the values to 0 and 1, so we will get something like this:

Image by Author

### Step 3: Contour finding

Let’s clarify what is a contour before anything else:

For computer vision, a contour is a set of points that describe a line or area. So for each detected contour we will not get a full binary mask but rather a set with a bunch of x and y values.

Ok, how can we isolate the desired area? Hmmm.. let’s think about it. We care about the lung regions that are shown on white. If we could find an algorithm to identify close sets or any kind of contours in the image that may help. After some search online, I found the marching squares method that finds constant valued contours in an image from skimage, called skimage.measure.find_contours().

After using this function I visualize the detected contours in the original CT image:

Image by Author

Here is the function!


def intensity_seg(ct_numpy, min=-1000, max=-300):
clipped = clip_ct(ct_numpy, min, max)
return measure.find_contours(clipped, 0.95)



### Step 4: Find the lung area from a set of possible contours

Note that I used a different image to show an edge case that the patient’s body is not a closed set of points. Ok, not exactly what we want, but let’s see if we could work that out.

Since we only care about the lungs we have to set some sort of constraints to exclude the unwanted regions.

To do so, I first extracted a convex polygon from the contour using scipy. After I assume 2 constraints:

• The contour of the lungs must be a closed set (always true)

• The contour must have a minimum volume of 2000 pixels to represent the lungs.

That may or may not include the body contour, resulting in more than 3 contours. When that happens the body is easily discarded by having the largest volume of the contour that satisfies the pre-described assumptions.

def find_lungs(contours):
"""
Chooses the contours that correspond to the lungs and the body
First, we exclude non-closed sets-contours
Then we assume some min area and volume to exclude small contours
Then the body is excluded as the highest volume closed set
The remaining areas correspond to the lungs
Args:
contours: all the detected contours

Returns: contours that correspond to the lung area
"""
body_and_lung_contours = []
vol_contours = []

for contour in contours:
hull = ConvexHull(contour)

# set some constraints for the volume
if hull.volume > 2000 and set_is_closed(contour):
body_and_lung_contours.append(contour)
vol_contours.append(hull.volume)

if len(body_and_lung_contours) == 2:
return body_and_lung_contours
elif len(body_and_lung_contours) > 2:
vol_contours, body_and_lung_contours = (list(t) for t in
zip(*sorted(zip(vol_contours, body_and_lung_contours))))
body_and_lung_contours.pop(-1) # body is out!
return body_and_lung_contours # only lungs left !!!


As an edge case, I am showing that the algorithm is not restricted to only two regions of the lungs. Inspect the blue contour below:

Image by Author

### Step 5: Contour to binary mask

Next, we save it as a nifty file so we need to convert the set of points to a lung binary mask. For this, I used the pillow python lib that draws a polygon and creates a binary image mask. Then I merge all the masks of the already found lung contours.

import numpy as np
from PIL import Image, ImageDraw
"""
Creates a binary mask with the dimensions of the image and
converts the list of polygon-contours to binary masks and merges them together
Args:
image: the image that the contours refer to
contours: list of contours

Returns:

"""
for contour in contours:
x = contour[:, 0]
y = contour[:, 1]
polygon_tuple = list(zip(x, y))
img = Image.new('L', image.shape, 0)
ImageDraw.Draw(img).polygon(polygon_tuple, outline=0, fill=1)

lung_mask[lung_mask > 1] = 1  # sanity check to make 100% sure that the mask is binary
return lung_mask.T  # transpose it to be aligned with the image dims


The desired lung area in $$mm^2$$ is simply the number of nonzero elements multiplied by the two pixel dimensions of the corresponding image.

The lung areas are saved in a csv file along with the image name.

Finally, to save the mask as nifty I used the value of 255 for the lung area instead of 1 to be able to display in a nifty viewer. Moreover, I save the image with the affine transformation of the initial CT slice to be able to be displayed meaningfully (aligned without any rotation conflicts).

def save_nifty(img_np, name, affine):
"""
binary masks should be converted to 255 so it can be displayed in a nii viewer
we pass the affine of the initial image to make sure it exits in the same
image coordinate space
Args:
name: output name
affine: 4x4 np array
Returns:
"""
img_np[img_np == 1] = 255
ni_img = nib.Nifti1Image(img_np, affine)
nib.save(ni_img, name + '.nii.gz')


Finally, I opened the mask with a common nifty viewer for Linux to validate that everything went ok. Here are snapshots for slice number 4:

Image by Author

I used a free medical imaging viewer called Aliza on Linux.

## Segment the main vessels and compute the vessels over lung area ratio

If there is a pixel with an intensity value over -500 HU inside the lung area then we will consider it as a vessel.

First, we do element-wise multiplication between the CT image and the lung mask to get only the lungs. Afterwards, we set the zeros that resulted from the element-wise multiplication to -1000 (AIR in HU) and finally keep only the intensities that are bigger than -500 as vessels.

def create_vessel_mask(lung_mask, ct_numpy, denoise=False):
vessels = lung_mask * ct_numpy  # isolate lung area
vessels[vessels == 0] = -1000
vessels[vessels >= -500] = 1
vessels[vessels < -500] = 0
show_slice(vessels)
if denoise:
return denoise_vessels(lungs_contour, vessels)
show_slice(vessels)
return vessels


One sample of this process can be illustrated below:

The vessel mask with some noise. Image by Author.

### Analyzing and improving the segmentation’s result

As you can see we have some parts of the contour of the lungs, which I believe we would like to avoid. To this end, I created a denoising function that considers the distance of the mask to all the contour points. If it is below 0.1, I set the pixel value to 0 and as a result exclude them from the detected vessels.

def denoise_vessels(lung_contour, vessels):
vessels_coords_x, vessels_coords_y = np.nonzero(vessels)  # get non zero coordinates
for contour in lung_contour:
x_points, y_points = contour[:, 0], contour[:, 1]
for (coord_x, coord_y) in zip(vessels_coords_x, vessels_coords_y):
for (x, y) in zip(x_points, y_points):
d = euclidean_dist(x - coord_x, y - coord_y)
if d <= 0.1:
vessels[coord_x, coord_y] = 0
return vessels


Below you can see the difference between the denoise image on the right and the initial mask:

Image by Author

If we overlay the mask in the original CT image we get:

Image by Author

def overlay_plot(im, mask):
plt.figure()
plt.imshow(im.T, 'gray', interpolation='none')


Now that we have the mask, the vessel area is computed similar to what I did for the lungs, by taking into account the individual image pixel dimension.

def compute_area(mask, pixdim):
"""
Computes the area (number of pixels) of a binary mask and multiplies the pixels
with the pixel dimension of the acquired CT image
Args:
pixdim: list or tuple with two values

Returns: the lung area in mm^2
"""
return lung_pixels * pixdim[0] * pixdim[1]


The ratios are stored in a csv file in the notebook.

I believe that now you have a solid understanding of CT images and their particularities. We can do a lot of awesome things with such rich information of 3D images.

Support us by staring our project on GitHub. Finally, for more advanced tutorials see our medical imaging articles.

I have been asked for a large hands-on AI course. I am thinking of writing a book on medical imaging in 2021. Until then you can learn from the coursera course AI for Medicine.

AI Summer is committed to protecting and respecting your privacy, and we’ll only use your personal information to administer your account and to provide the products and services you requested from us. From time to time, we would like to contact you about our products and services, as well as other content that may be of interest to you. If you consent to us contacting you for this purpose, please tick below to say how you would like us to contact you.

You can unsubscribe from these communications at any time. For more information on how to unsubscribe, our privacy practices, and how we are committed to protecting and respecting your privacy, please review our Privacy Policy. By clicking submit below, you consent to allow AI Summer to store and process the personal information submitted above to provide you the content requested.

* Please note that some of the links above might be affiliate links, and at no additional cost to you, we will earn a commission if you decide to make a purchase after clicking through the link.