net/mlx5: Skip mlx5_unload_one if mlx5_load_one fails
author Huy Nguyen <huyn@mellanox.com>
Tue, 8 Aug 2017 18:17:00 +0000 (13:17 -0500)
committer Saeed Mahameed <saeedm@mellanox.com>
Wed, 30 Aug 2017 18:20:43 +0000 (21:20 +0300)
If the firmware fails during mlx5_load_one, the health_care timer detects
the failure and schedules a health_care call. mlx5_load_one then detects
the failure as well, cleans up and returns. When the scheduled health_care
work runs, it calls mlx5_unload_one, which tries to clean up resources
that no longer exist and causes a kernel panic.

The root cause is that the bit MLX5_INTERFACE_STATE_DOWN is not set
after mlx5_load_one fails, so mlx5_unload_one does not take its early
exit. The solution is to remove the bit MLX5_INTERFACE_STATE_DOWN and to
return early from mlx5_unload_one if the bit MLX5_INTERFACE_STATE_UP is
not set. The bit MLX5_INTERFACE_STATE_DOWN is redundant;
MLX5_INTERFACE_STATE_UP can be used instead.
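
The difference between the two guards can be sketched in a small user-space
model. This is only an illustration, not the driver code: every name below
(MODEL_STATE_*, the model_* helpers and the *_guard_skips_unload functions)
is hypothetical, the bit numbers are arbitrary, and the model assumes a first
load attempt in which the failure path leaves both bits clear, as described
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit numbers for this model only; not the driver's enum. */
enum { MODEL_STATE_DOWN = 0, MODEL_STATE_UP = 1 };

/* Assumed state after a failed first load: neither bit was ever set. */
unsigned long model_state_after_failed_load(void)
{
	return 0UL;
}

/* State after a successful load: only UP is set. */
unsigned long model_state_after_successful_load(void)
{
	return 1UL << MODEL_STATE_UP;
}

/* Old guard: unload is skipped only when DOWN is set. */
bool old_guard_skips_unload(unsigned long state)
{
	return (state >> MODEL_STATE_DOWN) & 1UL;
}

/* New guard: unload is skipped whenever UP is not set. */
bool new_guard_skips_unload(unsigned long state)
{
	return !((state >> MODEL_STATE_UP) & 1UL);
}

/* Usage example: compare both guards on the states produced above. */
void model_selfcheck(void)
{
	unsigned long failed = model_state_after_failed_load();
	unsigned long loaded = model_state_after_successful_load();

	/* Old guard lets unload run after a failed load (the bug). */
	assert(!old_guard_skips_unload(failed));
	/* New guard turns unload into a NOP after a failed load. */
	assert(new_guard_skips_unload(failed));
	/* New guard still allows unload after a successful load. */
	assert(!new_guard_skips_unload(loaded));
}
```

In this model a single UP bit is enough to cover both cases, which matches
the argument that a separate DOWN bit is redundant.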

Fixes: 5fc7197d3a25 ("net/mlx5: Add pci shutdown callback")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
drivers/net/ethernet/mellanox/mlx5/core/main.c
include/linux/mlx5/driver.h

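diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index c065132b956d6ba772f812bff21a190d5759bf13..4cdb414aa2d51713c565c8bfb2ae1b00c33a2c50 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c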
@@ -1186,7 +1186,6 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
                }
        }
 
-       clear_bit(MLX5_INTERFACE_STATE_DOWN, &dev->intf_state);
        set_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state);
 out:
        mutex_unlock(&dev->intf_state_mutex);
@@ -1261,7 +1260,7 @@ static int mlx5_unload_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
                mlx5_drain_health_recovery(dev);
 
        mutex_lock(&dev->intf_state_mutex);
-       if (test_bit(MLX5_INTERFACE_STATE_DOWN, &dev->intf_state)) {
+       if (!test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state)) {
                dev_warn(&dev->pdev->dev, "%s: interface is down, NOP\n",
                         __func__);
                if (cleanup)
@@ -1270,7 +1269,6 @@ static int mlx5_unload_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
        }
 
        clear_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state);
-       set_bit(MLX5_INTERFACE_STATE_DOWN, &dev->intf_state);
 
        if (mlx5_device_registered(dev))
                mlx5_detach_device(dev);
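diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index df6ce59a1f954257cdef95a7733736e42c8b9491..918f5e644506d29a14a78e3d02e1bca2a66de225 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h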
@@ -673,9 +673,8 @@ enum mlx5_device_state {
 };
 
 enum mlx5_interface_state {
-       MLX5_INTERFACE_STATE_DOWN = BIT(0),
-       MLX5_INTERFACE_STATE_UP = BIT(1),
-       MLX5_INTERFACE_STATE_SHUTDOWN = BIT(2),
+       MLX5_INTERFACE_STATE_UP = BIT(0),
+       MLX5_INTERFACE_STATE_SHUTDOWN = BIT(1),
 };
 
 enum mlx5_pci_status {