DistillPrep

Deep Learning

Mastery Insight

"Focus on topics where you've failed edge-case questions. MAANG interviewers look for conceptual depth, not speed."

Easy: Activation Functions
A sigmoid activation outputs values in (0, 1). You use it in the hidden layers of a 10-layer deep network. During training, you observe that gradients in the first 3 layers are approximately 10⁻⁶, while gradients in the last 3 layers are approximately 0.1. What causes this disparity, and what is the standard fix?
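One way to reason about the question is to backpropagate through a toy network by hand. The sketch below is an illustration, not the site's reference answer. It assumes a chain of 10 scalar layers, a_k = f(w · a_{k-1}), with every weight fixed at 1.0 and the loss taken to be the final activation. The function names (`weight_gradients`, `sigmoid_grad`, `relu_grad`) and the input value 0.5 are hypothetical choices made for the example. It compares sigmoid with ReLU, which is one commonly cited fix; residual connections, normalization layers, and careful initialization are others.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # peaks at 0.25 when z == 0


def relu(z):
    return max(0.0, z)


def relu_grad(z):
    return 1.0 if z > 0 else 0.0


def weight_gradients(f, f_grad, n_layers=10, x=0.5, w=1.0):
    """Return |dL/dw_k| for k = 1..n_layers in the chain a_k = f(w * a_{k-1}).

    Toy assumptions: scalar layers, identical weights, loss L = a_n.
    """
    # Forward pass: store each layer's input and pre-activation.
    inputs, pre_acts = [], []
    a = x
    for _ in range(n_layers):
        z = w * a
        inputs.append(a)
        pre_acts.append(z)
        a = f(z)

    # Backward pass: every step multiplies by one more activation derivative.
    grads = [0.0] * n_layers
    upstream = 1.0  # dL/da_n
    for k in reversed(range(n_layers)):
        delta = upstream * f_grad(pre_acts[k])  # dL/dz_k
        grads[k] = abs(delta * inputs[k])       # dL/dw_k
        upstream = delta * w                    # dL/da_{k-1}
    return grads


if __name__ == "__main__":
    sig = weight_gradients(sigmoid, sigmoid_grad)
    rel = weight_gradients(relu, relu_grad)
    for k, (gs, gr) in enumerate(zip(sig, rel), start=1):
        print(f"layer {k:2d}  sigmoid {gs:.2e}  relu {gr:.2e}")
```

Because the sigmoid derivative never exceeds 0.25, the gradient reaching layer 1 from layer 10 is scaled by at most 0.25⁹ ≈ 3.8 × 10⁻⁶ from the activations alone, which is the same order of magnitude as the figure in the question. In this toy chain the ReLU derivative is exactly 1 on the active path, so the gradient magnitude stays constant across layers.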

Interview Tips

  • 1. Concepts over memorization.
  • 2. Identify trade-offs in every solution.