CSC320 — Introduction to Visual Computing, Spring 2024
Assignment 4: The PatchMatch Algorithm
Posted: Friday, November 15, 2024
Due: 11:59pm, Monday, Dec 2, 2024 (this is NOT the usual Thursday due date)
Late policy: 15% marks deduction per 24hrs, submission not accepted if > 5 days late
In this assignment you will implement the PatchMatch algorithm. This algorithm is described in a paper by Barnes et al. and will be discussed in tutorial this week. Your specific task is to complete the technique’s implementation in the starter code. The starter code is based on OpenCV and is supposed to be executed from the command line, not via a Kivy GUI.
Figure 1: Example output from Barnes et al. The top left image is reconstructed entirely from image patches from the bottom left image. From left to right we can see the algorithm’s progression from random initialization to near-perfect reconstruction.
Goals: The goals of the assignment are to (1) get you familiar with reading and understanding a research paper and (partially) implementing the technique it describes; (2) learn how to implement a more advanced algorithm for efficient image processing; and (3) understand how the inefficiencies you experienced in matching patches in the previous assignment can be overcome with a randomized technique. This algorithm has since become the workhorse of a number of image manipulation operations and was a key component of Adobe’s Content-Aware Fill feature.
Important: As in the previous assignments, you are advised to start immediately by reading the paper (see below). The next step is to run the starter code, and compare it to the output of the reference implementation. Unlike Assignment 3 where you implemented a couple of small functions called by an already-implemented algorithmic main loop, here your task will be to implement the algorithm itself. This requires a much more detailed understanding of the algorithm as well as pitfalls that can afect correctness, efficiency, etc. Expect to spend a fair amount of time implementing after you’ve understood what you have to do.
Testing your implementation: Unlike Assignment 3, it is possible to develop and run your code remotely from Teaching Lab machines.
Starter code & the reference solution
Use the following sequence of commands to unpack the starter code and to display its input argu- ments:
cd ~
tar xvfz patchmatch.tar.gz
rm patchmatch.tar.gz
cd patchmatch/code
python viscomp.py --help
See Sections 2 and 5 for details on how the code is structured and for guidelines about how to navigate it. Its structure should be familiar by now as it is similar to previous assignments. This code was lasted tested and compiled on the teach .cs server using python=3 .11. In addition to the starter code, I am providing the output of the reference solution for a pair of test images, along with input parameters and timings.
I am also providing a fully-featured reference solution in an encrypted format (with sourcedefender) to see how your own implementation should behave, and to make sure that your implementation produces the correct output. That being said, you should not expect your implementation to produce exactly the same output as the reference solution as tiny diferences in implementation might lead to slightly diferent results. This is not a concern, however, and the TAs will be looking at your code as well as its output to make sure what you are doing is reasonable.
Please read the CHECKLIST .txt carefully. You will need to complete this form prior to submission, and this will be the first file markers look at when grading your assignment.
The PatchMatch Algorithm (100 Marks)
The technique is described in full detail in the following paper (available here):
C. Barnes, E. Shechtman, A. Finkelstein and D. B. Goldman, “PatchMatch: A Randomized Algo- rithm for Structural Image Editing,” Proc. SIGGRAPH 2009.
You should read Section 1 of the paper right away to get a general idea of the principles behind the method. The problem it tries to address should be familiar to you given that the algorithm you worked with in A3 relied on a “nearest-neighbor search” procedure for identifying similar patches for inpainting. In fact, Criminisi et al’s inpainting algorithm is cited in Section 2 of the paper as a motivation for PatchMatch. You should read Section 2 as well, mainly for context and background.
The algorithm you are asked to implement is described in full in Section 3. The algorithm’s ini- tialization, described in Section 3.1, has already been implemented. Your task is to implement the algorithm’s basic iteration as described in Section 3.2 up to, but not including, paragraph Halting criteria. The starter code uses the terminology of Eq. (1) to make it easier for you to follow along.
For those of you who are more theoretically minded and/or have an interest in computer science theory, it is worth reading Section 3.3. This section is not required for implementation but it does help explain why the algorithm works as well as it does.
Sections 4 and 5 of the paper describe more advanced editing tools that use PatchMatch as a key component. They are not required for your implementation, and Section 4 in particular requires a fair amount of background that you currently don’t have. Read these sections if you are interested in finding out all the cool things that you can do with PatchMatch.
Part 1. Programming Component (90 Marks)
You need to complete to implement the two functions detailed below. A skeleton of both is included in file patchmatch/code/algorithm.py. This file is where your entire implementation will reside.
In addition to these functions, you will need to copy a few lines of code from your A3 imple- mentation for image reading and writing that are not provided in the starter code. See the file patchmatch/code/README 1st .txt for details.
Part 1.1. The propagation and random search() function (65 Marks)
This function takes as input a source image, a target image, and a nearest-neighbor field f that assigns to each patch in the source the best-matching patch in the target. This field is initially quite poor, i.e., the target patch it assigns to each source patch is definitely not the most similar patch in the target. The goal of the function is to return a new nearest-neighbor field that improves these patch-to-patch correspondences. The function accepts a number of additional parameters that control the algorithm’s behavior. Details about them can be found in the starter code and in the paper itself.
As explained in the paper, the algorithm involves two interleaved procedures, one called random search (50 marks) and the other called propagation (15 marks). You must implement both, within the same function. You are welcome to use helper functions in your implementation but this is not necessary (the reference implementation does not).
The starter code provides two flags that allow you to disable propagation or random search in this function. As you develop your implementation, you can use these flags for debugging purposes, to isolate problems related to one or the other procedure.
Part 1.2. The reconstruct source from target() function (15 Marks)
This function re-creates the source image by copying pixels from the target image, as prescribed by the supplied nearest-neighbor field. If this field is of high quality, then the copied pixels will be almost identical to those of the source; if not, the reconstructed source image will contain artifacts. Thus, comparing the reconstructed source to the original source gives you an idea of how good a nearest-neighbor field is.
Details of the function’s input and output parameters are in the starter code.
Part 1.3. Efficiency considerations (10 Marks)
You should pay attention to the efficiency of the code you write. Explicit loops cannot be completely avoided in propagation and random search() but their use should be kept to an absolute minimum. This is necessary to keep the running time of your code to a reasonable level: the input images you are supplied are quite large and it will take a very long time to process them if you use too many loops. No explicit loops are needed in reconstruct source from target().
Solutions that are no more than 50% slower than the reference implementation will receive full marks for efficiency. Less efficient implementations will have some of those points deducted depending on how much they deviate from this baseline.
Part 2. Report and Experimental evaluation (10 Marks)
Your task here is to put the PatchMatch to the test by conducting your own experiments. Try it on a variety of pairs of photos; on two adjacent frames of video; on “stereo” image pairs taken by capturing a photo of a static scene and then adjusting your viewpoint slightly (e.g., afew centimeters) to capture a second photo from a diferent point of view. Basically, run it on enough image pairs to understand when it works well and when it doesn’t.
At the very least, you must show the results of running the algorithm on the supplied source/target image pairs, using command-line arguments like those specified in Section 3.
Your report should highlight your implementation’s results (nearest neighbor field, reconstructed source, etc) and discuss how well your algorithm performs, and the conditions in which it doesn’t work well. The more solid evidence (i.e., results) you can include in your report PDF to back up your arguments and explanations, the more likely you will get full marks on your report.
Place your report in file patchmatch/report/report.pdf. You may use any word processing tool to create it (Word, LATEX, Powerpoint, HTML, etc.) but the report you turn in must be in PDF format.
What to turn in: You will be submitting the completed CHECKLIST form, your code, your written report and images. Use the following sequence of commands to pack your code and results:
cd patchmatch
tar cvfz assign4.tar.gz code results report/report.pdf CHECKLIST.txt
Upload the gzipped tarfile to MarkUs.
Important: After uploading the tarfile to MarkUS, download it from MarkUS, unpack it and verify that your code and all other assignment components are there! Students in previous instances of the course have accidentally submitted the starter tarfile instead of their own code and results, leading to major issues with marking—and wasting a lot of time for all involved. Because of this, there will be no accommodation for such errors this term: if you submit the starter code instead of your own code and/or submit no results, you will receive a zero on those parts of the assignment; if you discover this error a day later and want to re-submit, the standard late submission policy will apply. As stated on the course website, the course’s late policy includes a brief grace period to allow you to quickly resubmit your tarfile a few minutes after the submission deadline without any penalty if you discover omissions in your submitted tarfile.