This paper presents a new activation function and demonstrates its utility:
The paper presents and advocates for using periodic activation functions to learn implicit neural representations. The idea is relatively straightforward: use the sine function as the activation between neural network layers, i.e. sin(Wx + b). Unlike ReLU, whose second derivative is zero everywhere, these networks are much more expressive and able to capture fine detail. These networks (SIRENs) also have some interesting properties, such as the fact that the derivative of a SIREN is itself a SIREN. This allows a SIREN to behave well even when supervised directly on its derivatives. The authors show through a simple example that SIREN captures much finer detail than ReLU in a video fitting task, which illustrates the inherent limitations of ReLU. Since ReLU networks are also universal function approximators, I wonder whether the authors could have added an analysis showing that deeper ReLU networks can in fact accomplish these tasks, and that SIREN does the same with much smaller networks.
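For concreteness, my understanding of the proposed layer is roughly the following (a PyTorch sketch of my own; omega_0 = 30 follows the paper's stated default, but the module layout is my paraphrase, not the authors' code):

```python
import torch
import torch.nn as nn


class SineLayer(nn.Module):
    """One SIREN-style layer: y = sin(omega_0 * (W x + b)).

    A rough sketch of the idea as I understood it from the paper;
    omega_0 = 30 is the paper's default frequency scaling.
    """

    def __init__(self, in_features, out_features, omega_0=30.0):
        super().__init__()
        self.omega_0 = omega_0
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x):
        return torch.sin(self.omega_0 * self.linear(x))


# The derivative w.r.t. the input is omega_0 * W^T cos(omega_0 * (W x + b)),
# i.e. again a (phase-shifted) sine of an affine map, which is why the
# gradient of a SIREN is itself a SIREN and can be supervised directly.
```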
The authors discuss that SIREN does not work without carefully chosen initialization and present a scheme that keeps the distribution of activations at initialization independent, to a large extent, of the number of layers. They do this by analyzing the range and distribution of the output of each layer. The experiments are convincing in showing that the same initialization works across all domains, which lends empirical credibility to the initialization theory.
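As I read it, the scheme draws hidden-layer weights from U(-sqrt(6/fan_in)/omega_0, sqrt(6/fan_in)/omega_0) and first-layer weights from U(-1/fan_in, 1/fan_in); a sketch of that reading (not the authors' code) is below.

```python
import math
import torch
import torch.nn as nn


@torch.no_grad()
def siren_init(linear: nn.Linear, is_first: bool, omega_0: float = 30.0):
    """Sketch of the initialization as I understood it: the bounds are chosen
    so that the pre-activation distribution stays roughly the same at every
    depth, independent of the number of layers."""
    fan_in = linear.weight.size(-1)
    if is_first:
        bound = 1.0 / fan_in
    else:
        bound = math.sqrt(6.0 / fan_in) / omega_0
    linear.weight.uniform_(-bound, bound)
```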
The paper presents experiments on a diverse set of tasks: solving a PDE, learning a signed distance function, solving the wave equation, and recovering audio waveforms. They observe that SIREN can solve a PDE when supervised purely on the derivatives of the function. For the other tasks, they formulate each problem as a constrained optimization and solve it with a soft penalty loss. They compare their method with a discretization (grid) baseline, ReLU, and tanh, and show that SIREN outperforms all the baselines. The authors also experiment with learning a prior over SIRENs using hypernetworks; on the task of image in-painting they outperform baselines, demonstrating that generalizing over SIREN representations is powerful. Overall, I think the paper is clear to read and follow and has empirical evidence to support its claims.
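One small note for other readers on how I understood the derivative supervision and its soft-penalty relaxation: the constraint that the network's gradient match a target gradient field is relaxed into an L2 penalty and minimized. A rough sketch under that assumption follows; `siren`, `coords`, and `target_grads` are placeholder names of mine, not the authors' API.

```python
import torch


def gradient(y, x):
    """d y / d x via autograd, keeping the graph so the loss stays differentiable."""
    return torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                               create_graph=True)[0]


def grad_supervision_loss(siren, coords, target_grads):
    """Soft L2 penalty on the constraint grad(Phi(coords)) = target_grads."""
    coords = coords.clone().requires_grad_(True)
    out = siren(coords)
    grads = gradient(out, coords)
    return ((grads - target_grads) ** 2).mean()
```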