Path to A Senior PhD


Contents


Additions and discussion are welcome.


I. Motivation


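"Should I do a PhD?" is an important and serious question.

Whatever your answer is, please read this section to the end. My views may not be entirely right, but I hope they help you at least a little.

A PhD is a long process: 5-6 years for students admitted directly from undergraduate study, and 4-5 years for those who continue after a master's degree. That span often takes up half of your youth (ages 20-30), the best years of your life. So think very carefully before you make your decision. As you decide, listen to your own inner voice and consider the following questions: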

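1. Do my personality and abilities suit a PhD?

2. What do I want from a PhD?

3. Can I accept the worst outcome?

If you have read this far and I still have not talked you out of it, congratulations: you have taken the first step!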

[Back to Contents]


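II. Research Skills

1. Code

2. Documentation

Writing documentation is a very worthwhile habit. It records and accumulates your day-to-day work, and it saves you time when you need that work again; as the saying goes, sharpening the axe does not delay the woodcutting. For example, you may need it for later reference, as material for a presentation, or to resolve problems that keep recurring.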

3. PPT

4. Courses

See: Getting Started with 3D Vision Research

5. English

[Back to Contents]


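III. Research Capability

The abilities a PhD student should have, how to train them, and how to carry out a research project: [Ref 1][Ref 2]

Abilities a PhD student should have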

[Back to Contents]


IV. Paper Writing

Paper writing template: [Ref]

Paper writing (short version)

Paper writing (detailed version)

Title

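  1. The title matters, because different titles are likely to attract reviewers from different fields. Before choosing a title, write down the important keywords, then build the title from those keywords.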

Abstract

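  1. How to write a good abstract: (1) work out the line of reasoning for the abstract; (2) fit it to the templates below; (3) revise the abstract repeatedly.
  2. The key is to answer the following questions one by one before writing.
  1. Version 1: introduce the technical challenge, then use one or two sentences to introduce the technical contribution that solves it
    \section{Abstract}
        % Task
        % Technical challenge for previous methods (center the discussion on the technical challenge that we solve)
        % One or two sentences on the technical contribution that solves the challenge (usually just name the technique xxx without describing each step. The name must be understandable to the reader and must not feel like a jump. This skill matters a lot for writing a good abstract.)
        % Benefits of the technical contribution
        % Experiment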
        
  2. Version 2: introduce the technical challenge, then use one or two sentences to introduce the insight that solves it, then one sentence to introduce the technical contribution that implements the insight. (I personally recommend this version.)
    \section{Abstract}
        % Task
        %% Example 1: In recent years, generative models have undergone significant advancement due to the success of diffusion models.
        %% Example 2: This paper addresses the challenge of novel view synthesis for a human performer from a very sparse set of camera views.
        
        % Technical challenge for previous methods (center the discussion on the technical challenge that we solve)
        %% Example 1: The success of these models is often attributed to their use of guidance techniques, such as classifier and classifier-free methods, which provides effective mechanisms to tradeoff between fidelity and diversity. However, these methods are not capable of guiding a generated image to be aware of its geometric configuration, e.g., depth, which hinders the application of diffusion models to areas that require a certain level of depth awareness.
        %% Example 2: Some recent works have shown that learning implicit neural representations of 3D scenes achieves remarkable view synthesis quality given dense input views. However, the representation learning will be ill-posed if the views are highly sparse.
        
        % One sentence on the insight that solves the challenge
        %% Example 1: To address this limitation, we propose a novel guidance approach for diffusion models that uses estimated depth information derived from the rich intermediate representations of diffusion models.
        %% Example 2: To solve this ill-posed problem, our key idea is to integrate observations over video frames.
        
        % One or two sentences on the technical contribution that implements the insight (usually just name the technique xxx without describing each step. The name must be understandable to the reader and must not feel like a jump. This skill matters a lot for writing a good abstract.)
        %% Example 1: To do this, we first present a label-efficient depth estimation framework using the internal representations of diffusion models. At the sampling phase, we utilize two guidance techniques to self-condition the generated image using the estimated depth map, the first of which uses pseudo-labeling, and the subsequent one uses a depth-domain diffusion prior.
        %% Example 2: To this end, we propose Neural Body, a new human body representation which assumes that the learned neural representations at different frames share the same set of latent codes anchored to a deformable mesh
        
        % Benefits of the technical novelty
        %% Example 2: so that the observations across frames can be naturally integrated. The deformable mesh also provides geometric guidance for the network to learn 3D representations more efficiently.
        
        % Experiment
        
  3. Version 3: when there are multiple technical contributions, describe each technical contribution and its technical advantage in turn
    \section{Abstract}
        % Task
        %% This paper introduces a novel contour-based approach named deep snake for real-time instance segmentation.
        
        %% Unlike some recent methods that directly regress the coordinates of the object boundary points from an image
        
        % One sentence on a technical contribution and its technical advantage (this skill matters a lot for writing a good abstract.)
        %% deep snake uses a neural network to iteratively deform an initial contour to match the object boundary, which implements the classic idea of snake algorithms with a learning-based approach.
        
        % One sentence on a technical contribution and its technical advantage
        %% For structured feature learning on the contour, we propose to use circular convolution in deep snake, which better exploits the cycle-graph structure of a contour compared against generic graph convolution.
        
        % One sentence on a technical contribution and its technical advantage
        %% Based on deep snake, we develop a two-stage pipeline for instance segmentation: initial contour proposal and contour deformation, which can handle errors in object localization.
        
        % Experiment
        

Introduction

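  1. How to write a good introduction: (1) work out the line of reasoning for the introduction; (2) fit it to the template below; (3) revise the introduction repeatedly.
  2. How to work out the line of reasoning for the Introduction: reason backward first, then forward.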
\section{Introduction}
    % Task and application
    % Technical challenge for previous methods (center the discussion on the technical challenge that we solve. A technical challenge consists of a limitation and its technical reason)
    % Our pipeline that solves the challenge
    % Experiment
    % Contributions
    
  1. Introduce the task and application
  2. Introduce the technical challenge for previous methods
  3. Introduce our pipeline that solves the challenge
  4. Introduce the experiments and contributions
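
One possible way to fill the Contributions slot of the template above is a short list that closes the Introduction. The fragment below is only a sketch to paste into the body of the Introduction: every xxx is a placeholder, and the three-item split is an assumption, not part of the original template.

```latex
% Hypothetical closing paragraph of the Introduction (placeholders only).
In summary, this work makes the following contributions:
\begin{itemize}
    % Item 1: the core idea and the challenge it addresses.
    \item We propose xxx, which addresses the technical challenge of xxx.
    % Item 2: the component that implements the idea and its benefit.
    \item We introduce xxx to implement this idea, which brings the benefit of xxx.
    % Item 3: the experimental evidence; state only what the experiments support.
    \item Experiments on xxx show that our method xxx.
\end{itemize}
```

Each item should correspond to a claim that the Experiment section actually backs up.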

Method

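  1. How to write the method clearly: (1) answer the questions below; (2) sketch the pipeline figure; (3) write the method step by step.
  2. Steps for writing the method:
  3. The four elements of a pipeline module:
  4. How to check whether your Method is easy to understand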
    % Overview
        % One or two sentences on the setting
        %% Example 1: Given a sparse multi-view video of a performer, our task is to generate a free-viewpoint video of the performer.
        %% Example 2: Given an image, the task of pose estimation is to detect objects and estimate their orientations and translations in the 3D space.
        
        % One or two sentences on the paper's core contribution
        %% Example 1: We build upon prior work for static scenes [46], to which we add the notion of time, and estimate 3D motion by explicitly modeling forward and backward scene flow as dense 3D vector fields.
        %% Example 2: Inspired by [21, 25], we perform object segmentation by deforming an initial contour to match object boundary.
        %% Example 3: Inspired by recent methods [29, 30, 36], we estimate the object pose using a two-stage pipeline: we first detect 2D object keypoints using CNNs and then compute 6D pose parameters using the PnP algorithm. Our innovation is in a new representation for 2D object keypoints as well as a modified PnP algorithm for pose estimation.
        
        % If the paper's pipeline/framework is fairly novel, draw a figure to introduce the pipeline/framework
        %% Example: The overview of the proposed model is illustrated in Figure 3.
        
        % What Section 3.1 describes
        %% Example 1: Neural Body starts from a set of structured latent codes attached to the surface of a deformable human model (Section 3.1).
        %% Example 2: In this section, we first describe how to model 3D scenes with MLP maps (Section 3.1).
        
        % What Section 3.2 describes
        %% Example 1: The latent code at any location around the surface can be obtained with a code diffusion process (Section 3.2) and then decoded to density and color values by neural networks (Section 3.3).
        %% Example 2: Then, Section 3.2 discusses how to represent volumetric videos with dynamic MLP maps.
        
        % What Section 3.3 describes
        %% Example 2: Finally, we introduce some strategies to speed up the rendering process (Section 3.3).
        
  5. Section 3.1
    % 1. First describe the technique's forward process or module design (first summarize what we want to do, then write how we do it: given the input, go through steps xx, and obtain the output. That is, "Given xxx, we first xxx, then xxx, finally xxx")
        % 1.1 What we want to do
        %% Example: Given the input features defined on a contour, deep snake introduces the circular convolution for the feature learning, as illustrated in Figure 2.
        
        % How we do it
        % 1.2 we first do xx.
        %% Example: first, construct the circular convolution.
        
        % 1.3 then, we do xx.
        %% Example: Similar to the standard convolution, we can construct a network layer based on the circular convolution for feature learning, which is easy to be integrated into a modern network architecture.
        
        % 1.4 finally, we do xx.
        %% Example: After the feature learning, deep snake applies three 1×1 convolution layers to the output features for each vertex and predicts vertex-wise offsets between contour points and the target points, which are used to deform the contour.
        
        % 2. Then describe the technique's technical advantage (motivation)
        %% Example: As discussed in the introduction, the proposed circular convolution better exploits the circular structure of the contour than the generic graph convolution. We will show the experimental comparison in Section 5.2. An alternative method is to use standard CNNs to regress a pixel-wise vector field from the input image to guide the evolution of the initial contour [37, 33, 40]. We argue that an important advantage of deep snake over the standard CNNs is the object-level structured prediction, i.e., the offset prediction at a vertex depends on other vertices of the same contour. Therefore, deep snake will predict a more reasonable offset for a vertex located far from the object. Standard CNNs may have difficulty in this case, as the regressed vector field may drive this vertex to another object which is closer.
        
    % 1. First write the motivation (why this technique is proposed)
        %% Example: The implicit fields assign the density and color to each point in the 3D space, which requires us to query the latent codes at continuous 3D locations. This can be achieved with the trilinear interpolation. However, since the structured latent codes are relatively sparse in the 3D space, directly interpolating the latent codes leads to zero vectors at most 3D points. To solve this problem, we diffuse the latent codes defined on the surface to nearby 3D space.
        
        % 2. Then describe the technique's forward process or module design (first summarize what we want to do, then write how we do it: given the input, go through steps xx, and obtain the output. That is, "Given xxx, we first xxx, then xxx, finally xxx")
        % 2.1 What we want to do
        %% Example: Inspired by [65, 56, 49], we choose the SparseConvNet [21] to efficiently process the structured latent codes, whose architecture is described in Table 1. 
        
        % How we do it
        % 2.2 we first do xx.
        %% Example: Specifically, based on the SMPL parameters, we compute the 3D bounding box of the human and divide the box into small voxels with voxel size of 5mm × 5mm × 5mm. The latent code of a nonempty voxel is the mean of latent codes of SMPL vertices inside this voxel.
        
        % 2.3 then, we do xx.
        %% Example: SparseConvNet utilizes 3D sparse convolutions to process the input volume and output latent code volumes with 2×, 4×, 8×, 16× downsampled sizes. With the convolution and downsampling, the input codes are diffused to nearby space. 
        
        % 2.4 finally, we do xx.
        %% Example: Following [56], for any point in 3D space, we interpolate the latent codes from multi-scale code volumes of network layers 5, 9, 13, 17, and concatenate them into the final latent code. Since the code diffusion should not be affected by the human position and orientation in the world coordinate system, we transform the code locations to the SMPL coordinate system.
        
  6. Drawing figures for the paper

Experiment

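  1. To write good Experiments, you need to answer three questions:
  2. In the text of the Experiments section, the figure and table captions are what matter most.
  3. Layout tip for experiment figures and tables: a single-column figure/table looks better in the right column of the page, because readers habitually look to the top left for the first line of text.
  4. Which comparison experiments to run
  5. Which ablation studies to run
  6. Which applications/demos to build (these have a very large bearing on the paper's impact)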
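
Because the captions carry so much of the Experiments text, it can help to draft each results table together with a self-contained caption. The fragment below is a minimal sketch, assuming the booktabs package is loaded in the preamble; the label, the column names, and every xx entry are placeholders rather than real results.

```latex
% Requires \usepackage{booktabs} in the preamble (assumption).
\begin{table}[t]
    \centering
    % The caption says what is compared, on which data, and how to read the numbers.
    \caption{Quantitative comparison on the xxx dataset. Metric A: higher is better. Metric B: lower is better.}
    \label{tab:comparison}  % hypothetical label
    \begin{tabular}{lcc}
        \toprule
        Method & Metric A & Metric B \\
        \midrule
        Baseline xxx & xx.xx & xx.xx \\
        Ours         & xx.xx & xx.xx \\
        \bottomrule
    \end{tabular}
\end{table}
```

A reader who looks only at the table and its caption should still be able to tell what the comparison shows.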

Related Work

  1. To write good Related Work, the steps are:

Conclusion

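  1. You need to write a Limitation discussion; otherwise reviewers will often count "no limitation discussed" as a weakness. The limitations you write should generally be ones that follow from the task goal or task setting (similar to discussing future work); do not write about technical flaws of the method.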
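
The advice above can be sketched in the same comment-template style used for the other sections. The first two slots and the placeholder sentence are assumptions added for illustration; only the limitation slot restates the advice given here.

```latex
\section{Conclusion}
    % One or two sentences restating the task and the core contribution (assumed slot)
    % One sentence summarizing the experimental evidence (assumed slot)
    % Limitation: a limitation that follows from the task goal or task setting, framed like future work
    %% Placeholder: Our method assumes xxx as input, so it does not handle xxx. Extending it to xxx is left as future work.
```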

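How to revise a paper

  1. Pay attention to every claim in the paper (especially the claims in the abstract and introduction): each one must be free of errors and supported by experiments. Otherwise some reviewers will reject the paper on that basis alone.
  2. Add a self-review list of questions at the end of the paper, divided into five aspects. Pose questions under each aspect, then revise the paper according to those questions:
  3. A very important way to ensure the paper's quality: strive for perfection.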

[Back to Contents]


V. References

[1] Learning Research by Sida Peng from ZJU

[Back to Contents]