
COMP9517: Computer Vision

2024 Term 3

Group Project Specification

Maximum Marks Achievable: 40

The group project is worth 40% of the total course mark.

Project work is in Weeks 6-10 with deliverables due in Week 10.

Deadline for submission is Friday 15 November 2024 18:00:00 AET.

Instructions for online submission will be posted closer to the deadline.

Refer to the separate marking criteria for detailed information on marking.

Introduction

The goal of the group project is to work together with peers in a team of 5 students to solve a computer vision problem and present the solution in both oral and written form.

Group members can meet with their assigned tutors once per week in Weeks 6-10 during the usual consultation session hours to discuss progress and get feedback.

The group project is to be completed by each group separately. Do not copy ideas or any materials from other groups. If you use publicly available methods or software for some of the steps, these must be properly attributed/referenced. Failing to do so is plagiarism and will be penalised according to UNSW rules described in the Course Outline.

You are expected to show creativity and build on ideas taught in the course or from the computer vision literature. High marks will be given only to groups that develop methods not previously used for the project task. We do not expect you to develop everything from scratch, but the more you rely on existing code (which will be checked), the lower the mark.

Description

Recognizing individual animals from photographs is important for many wildlife studies, including population monitoring, behaviour analysis, and management. This is often done by expert manual analysis of the photographs, in particular segmentation of the relevant body parts of the animals, which is very labour-intensive. The continuing collection and expansion of image datasets spanning multiple years of observation is causing a growing need for computer vision methods to automate the task.

Task

The goal of this group project is to develop and compare different computer vision methods for segmenting sea turtles from photographs. More specifically, the task is to segment the head, flippers, and carapace of each turtle.

Dataset

The dataset to be used in this group project is called SeaTurtleID2022 and is available from Kaggle (see references with links at the end of this document). It contains 8,729 photographs of 438 unique sea turtles collected over 13 years in 1,221 encounters, making it the longest-spanned dataset for animal reidentification.

Each photograph comes with annotations such as identities, encounter timestamps, and segmentation masks of the body parts. Further details are provided in the paper associated with the dataset (see references). On WebCMS3 we provide a Jupyter notebook showing how to load the turtle photographs and corresponding annotations.
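As an illustration, the sketch below shows one way such annotations could be loaded, assuming they follow the COCO format. The file name annotations.json and the category names are assumptions, not confirmed details of the dataset; the provided notebook is the authoritative reference for the actual layout.

```python
# Minimal sketch for loading a photograph and its body-part masks.
# Assumes COCO-style annotations; the file name and paths are assumptions,
# so check the dataset page and the provided notebook for the real layout.
import numpy as np
from PIL import Image
from pycocotools.coco import COCO

coco = COCO("annotations.json")                    # assumed file name
img_info = coco.loadImgs(coco.getImgIds()[0])[0]   # first image as an example
image = np.array(Image.open(img_info["file_name"]))

# Merge all annotations of the same category into one binary mask per part.
masks = {}
for ann in coco.loadAnns(coco.getAnnIds(imgIds=img_info["id"])):
    part = coco.loadCats(ann["category_id"])[0]["name"]  # e.g. "head"
    mask = coco.annToMask(ann).astype(bool)
    masks[part] = masks.get(part, np.zeros(mask.shape, bool)) | mask
```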

Methods

Many traditional, machine learning, and deep learning-based computer vision methods could be used for this task. You are challenged to use concepts taught in the course and other techniques from the literature to develop your own methods and test their performance.

At least two different methods must be developed and tested. For example, you could compare one machine learning-based method and one more traditional method, or two deep learning-based methods using different neural network architectures.

Although we do not expect you to develop everything from scratch, we do expect to see some new combination of techniques, or modifications of existing ones, or the use of more state-of-the-art methods that have not been tried before for sea turtle segmentation.

As there are virtually infinitely many possibilities here, it is impossible to give detailed criteria, but as a general guideline, the more you develop things yourself rather than copy straight from elsewhere, the better. In any case, always cite your sources.

Training

If your methods require training (that is, if you use supervised rather than unsupervised segmentation approaches), you must ensure that the training images are not also used for testing. Even if your methods do not require training, they may have hyperparameters that you need to fine-tune to get optimal performance. In that case, too, you must use the training set, not the test set. Using (partly) the same data for both training/fine-tuning and testing leads to biased results that are not representative of real-world performance.

Specifically, you must use open-set splitting of the dataset into training, validation, and test subsets, as defined by the creators in the metadata of the SeaTurtleID2022 dataset. In their paper (see references below) the creators explain that open-set splitting gives a much more realistic performance evaluation than closed-set or random splitting.
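As a minimal illustration, a predefined split stored in the dataset's metadata could be honoured along the following lines. The file name metadata.csv and the column name split_open are assumptions; verify them against the actual dataset files.

```python
# Minimal sketch for honouring the predefined open-set split.
# The file name "metadata.csv" and column name "split_open" are
# assumptions -- verify them against the dataset documentation.
import pandas as pd

meta = pd.read_csv("metadata.csv")                 # assumed file name
train_df = meta[meta["split_open"] == "train"]
valid_df = meta[meta["split_open"] == "valid"]
test_df  = meta[meta["split_open"] == "test"]

# Use train_df for fitting/fine-tuning, valid_df for model selection,
# and touch test_df only once, for the final evaluation.
```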

Testing

To assess the performance of your methods, compare the segmented regions quantitatively with the corresponding manually annotated (labelled) masks by calculating the intersection over union (IoU), also known as the Jaccard similarity coefficient (JSC).

That is, for each photograph in the test set, compute the IoU of the predicted head segment with the corresponding head segment in the annotation, and similarly for the flippers and the carapace (“turtle”). Then compute the mean IoU (mIoU) over the whole test set for each of the three categories separately (head mIoU, flippers mIoU, and carapace mIoU).
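For two binary masks A and B, IoU(A, B) = |A ∩ B| / |A ∪ B|. A minimal sketch of this computation, assuming predictions and ground-truth annotations are boolean NumPy arrays of equal shape:

```python
# Minimal sketch of IoU and per-class mean IoU over a test set,
# assuming predicted and annotated masks are boolean NumPy arrays.
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over union (Jaccard) of two binary masks."""
    union = np.logical_or(pred, gt).sum()
    if union == 0:                    # both masks empty: define IoU as 1
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)

def mean_iou(preds, gts) -> float:
    """mIoU over paired lists of predicted and annotated masks."""
    return float(np.mean([iou(p, g) for p, g in zip(preds, gts)]))

# Compute the three category scores separately, e.g.:
# head_miou = mean_iou(head_preds, head_gts)
```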

Show these quantitative results in your video presentation and written report (see deliverables below). Also show representative examples of successful segmentations as well as examples where your methods failed (no method generally yields perfect results). Explain why you believe your methods failed in these cases.

Furthermore, if one of your methods clearly works better than the other, discuss possible reasons why. And, finally, discuss some potential directions for future research to further improve the segmentation performance for this dataset.

Practicalities

The SeaTurtleID2022 dataset is less than 2 GB in total, so method training and testing should be feasible on a modern desktop or laptop computer. If more computing resources are needed, you could consider using the free version of Google Colab. Alternatively, you are free to use only a subset of the data, for example 75% or 50%. Of course, you can expect the performance of your methods to go down accordingly, but as long as you clearly report your approach, this will not negatively impact your project mark.

Deliverables

The deliverables of the group project are 1) a video presentation, 2) a written report, and 3) the code. The deliverables are to be submitted by only one member of the group, on behalf of the whole group (we do not accept submissions from multiple group members). More detailed information on the deliverables:

Video

Each group must prepare a video presentation of at most 10 minutes showing their work. The presentation must start with an introduction of the problem and then explain the methods used, show the obtained results, and discuss these results as well as ideas for future improvements. For this part of the presentation, use PowerPoint slides to support the narrative. Following this part, the presentation must include a demonstration of the methods/software in action. Of course, some methods may take a long time to compute, so you may record a live demo and then edit it to stay within time.

The entire presentation must be in the form of a video (720p or 1080p MP4 format) of at most 10 minutes (anything beyond that will be ignored). All group members must present (points may be deducted if this is not the case), but it is up to you to decide who presents which part (introduction, methods, results, discussion, demonstration). In order for us to verify that all group members are indeed presenting, each student presenting their part must be visible in a corner of the presentation (live recording, not a static head shot), and when they start presenting, they must mention their name.

Overlaying a webcam recording can easily be done using either the video recording functionality of PowerPoint itself (see for example this YouTube tutorial) or using other recording software such as OBS Studio, Camtasia, Adobe Premiere, and many others. It is up to you (depending on your preference and experience) which software to use, as long as the final video satisfies the requirements mentioned above.

Also note that video files can easily become quite large (depending on the level of compression used). To avoid storage problems for this course, the video upload limit will be 100 MB per group, which should be more than enough for this type of presentation. If your video file is larger, use tools like HandBrake to re-encode with higher compression.

The video presentations will be marked offline (there will be no live presentations). If the markers have any concerns or questions about the presented work, they may contact the group members by email for clarification.

Report

Each group must also submit a written report (in 2-column IEEE format, max. 10 pages of main text, and in addition any number of references).

The report must be submitted as a PDF file and include:

1. Introduction: Discuss your understanding of the task specification and dataset.

2. Literature Review: Review relevant techniques in the literature, along with any necessary background to understand the methods you selected.

3. Methods: Motivate and explain the selection of the methods you implemented, using relevant references and theories where necessary.

4. Experimental Results: Explain the experimental setup you used to test the performance of the developed methods and the results you obtained.

5. Discussion: Provide a discussion of the results and method performance, in particular reasons for any failures of the method (if applicable).

6. Conclusion: Summarise what worked / did not work and recommend future work.

7. References: List the literature references and other resources used in your work. All external sources (including websites) used in the project must be referenced. The references section does not count toward the 10-page limit.

Code

The complete source code of the developed software must be submitted as a ZIP file and, together with the video and report, will be assessed by the markers. Therefore, the submission must include all necessary modules/information to easily run the code. Software that is hard to run or does not produce the demonstrated results will result in deduction of points. The upload limit for the source code (ZIP) plus report (PDF) together will be 100 MB. Note that this upload limit is separate from the video upload limit (also 100 MB).

Plagiarism detection software will be used to screen all submitted materials (reports and source code). Comparisons will be made not only pairwise between submissions, but also with related assignments in previous years (if applicable) and publicly available materials. See the Course Outline for the UNSW Plagiarism Policy.

Student Contributions

As a group, you are free in how you divide the work among the group members, but all group members must contribute roughly equally to the method development, coding, making the video, and writing the report. For example, it is unacceptable if some group members only prepare the video and report without contributing to the methods and code.

An online survey will be held at the end of the term allowing students to anonymously evaluate the relative contributions of their group members to the project. The results will be reported only to the LIC and the Course Administrators, who at their discretion may moderate the final project mark for individual students if there is sufficient evidence that they contributed substantially less than the other group members.