
ceph-iscsi: add erasure pool support #237

Open · wants to merge 4 commits into base: main
Conversation

lxbsz (Member) commented Jun 21, 2021

Erasure coded pools do not support omap, so only the data will be stored
in the EC pool; the metadata will be stored in the 'rbd' replicated pool.

Signed-off-by: Xiubo Li [email protected]

lxbsz requested a review from idryomov on June 21, 2021 09:07
lxbsz added the feature label on June 21, 2021
lxbsz force-pushed the erasure branch 6 times, most recently from e61a60d to 949ca2d on June 25, 2021 02:52
lxbsz (Member Author) commented Jun 25, 2021

@idryomov

All my tests have worked well so far. Please review, thanks.

if self.is_erasure_pool:
_pool = settings.config.pool
data_pool = self.pool
with cluster.open_ioctx(_pool) as ioctx:
Contributor:

I'm not sure doing it this way is a good idea. Constraining the metadata pool to SYS_SETTINGS.pool seems unnecessarily restrictive. Why not add an optional data_pool parameter to the create RPC, allowing the user to specify both pools?

Member Author:

Yeah, this also looks good to me.

store_type = json.loads(data)[0]['osd_objectstore']
self.logger.debug(f"pool ({self.pool}) objectstore type is ({store_type})")
if store_type not in ['bluestore', 'filestore']:
self.logger.error("Only bluestore/filestore objectstores allowed for erasure pool")
Contributor:

This check is bogus because one can have a cluster with OSDs backed by different stores. Querying just one OSD isn't going to do it.

But then I don't understand the need to check osd_objectstore at all. Why isn't ensuring that EC overwrites are enabled enough?

Member Author (Jun 28, 2021):

This check is bogus because one can have a cluster with OSDs backed by different stores. Querying just one OSD isn't going to do it.

Okay.

But then I don't understand the need to check osd_objectstore at all. Why isn't ensuring that EC overwrites are enabled enough?

From https://docs.ceph.com/en/latest/rados/operations/erasure-code/#erasure-coding-with-overwrites, I think maybe we should only allow bluestore?

If my understanding is correct, it will only work well with bluestore?

Contributor:

I don't think we should be trying to limit it beyond what is allowed by librbd. When creating an image with a separate data pool, librbd ensures that the specified pool supports overwrites.

The user wouldn't be able to set allow_ec_overwrites on a filestore-backed pool without also messing with mon_debug_no_require_bluestore_for_ec_overwrites. If they do that, they are on their own ;)
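
For reference, a minimal sketch of what this looks like through the Python rados/rbd bindings (assuming the binding in use supports the data_pool keyword; this is not ceph-iscsi code):

    import rados
    import rbd

    # Minimal sketch: create an image whose data lives in a separate pool.
    # librbd itself rejects a data pool that does not support overwrites.
    def create_image(conffile, meta_pool, name, size_bytes, data_pool=None):
        cluster = rados.Rados(conffile=conffile)
        cluster.connect()
        try:
            with cluster.open_ioctx(meta_pool) as ioctx:
                if data_pool:
                    rbd.RBD().create(ioctx, name, size_bytes, data_pool=data_pool)
                else:
                    rbd.RBD().create(ioctx, name, size_bytes)
        finally:
            cluster.shutdown()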

ceph_iscsi_config/utils.py (outdated; resolved)
lxbsz (Member Author) commented Jul 26, 2021

Added an ecpool= option for creating a disk:

/> disks/ create image=block14 pool=datapool ecpool=ecpool size=1M

Without ecpool= specified, pool= is used to store both the metadata and the data. Otherwise, pool= stores the metadata and ecpool= stores the data.

For an image with ecpool= specified, the disk is displayed as: <replicated pool name>/<erasure pool name>/<image name>

[root@node01 ceph-iscsi]# gwcli ls
o- / ......................................................................................................................... [...]
  o- cluster ......................................................................................................... [Clusters: 1]
  | o- ceph .......................................................................................................... [HEALTH_WARN]
  |   o- pools .......................................................................................................... [Pools: 6]
  |   | o- cephfs.a.data ......................................................... [(x3), Commit: 0.00Y/29819642K (0%), Used: 0.00Y]
  |   | o- cephfs.a.meta ....................................................... [(x3), Commit: 0.00Y/29819642K (0%), Used: 134000b]
  |   | o- datapool ............................................................... [(x3), Commit: 3M/29819642K (0%), Used: 418943b]
  |   | o- device_health_metrics ................................................. [(x3), Commit: 0.00Y/29819642K (0%), Used: 0.00Y]
  |   | o- ecpool ................................................................. [(3+1), Commit: 0.00Y/59639284K (0%), Used: 24K]
  |   | o- rbd ................................................................. [(x3), Commit: 0.00Y/29819642K (0%), Used: 103304b]
  |   o- topology ................................................................................................ [OSDs: 3,MONs: 3]
  o- disks .......................................................................................................... [3M, Disks: 3]
  | o- datapool .................................................................................................... [datapool (3M)]
  |   o- block14 ............................................................................ [datapool/ecpool/block14 (Online, 1M)]
  |   o- block15 ................................................................................... [datapool/block15 (Online, 1M)]
  |   o- block16 ............................................................................ [datapool/ecpool/block16 (Online, 1M)]
  o- iscsi-targets ............................................................................... [DiscoveryAuth: None, Targets: 1]
    o- iqn.2003-01.com.redhat.iscsi-gw:ceph-gw ........................................................... [Auth: None, Gateways: 2]
      o- disks .......................................................................................................... [Disks: 3]
      | o- datapool/block15 ................................................................................ [Owner: node01, Lun: 0]
      | o- datapool/ecpool/block14 ......................................................................... [Owner: node02, Lun: 1]
      | o- datapool/ecpool/block16 ......................................................................... [Owner: node01, Lun: 2]
      o- gateways ............................................................................................ [Up: 2/2, Portals: 2]
      | o- node01 ............................................................................................ [172.16.219.128 (UP)]
      | o- node02 ............................................................................................ [172.16.219.138 (UP)]
      o- host-groups .................................................................................................. [Groups : 0]
      o- hosts ....................................................................................... [Auth: ACL_ENABLED, Hosts: 0]

lxbsz force-pushed the erasure branch 2 times, most recently from 104ad8f to c7c259c on November 5, 2021 02:09
lxbsz requested a review from ideepika on November 5, 2021 03:05
lxbsz force-pushed the erasure branch 3 times, most recently from f1b8906 to a8daca9 on November 22, 2021 12:24
README (outdated; resolved)
tmp/keyring (outdated; resolved)

README (outdated):
| o- disk_1 ........................................................... [rbd/disk_1 (1G)]
| o- disk_2 ........................................................... [rbd/disk_2 (2G)]
| o- disk_3 ........................................................... [rbd/disk_3 (2G)]
| o- disk_4 ........................................................ [rbd/ec/disk_4 (1G)]
Contributor:

This makes it look like ec is somehow a sub-pool of rbd. For displaying here, I think it would be better to do <metadata pool>+<data pool>, i.e.

  • mypool/myimg for a "normal" image
  • mymetapool+mydatapool/myimg for an image with a separate data pool

Member Author:

Sure, this looks better, I will update it.

README (outdated):
curl --user admin:admin -d mode=create
-X PUT http://192.168.122.69:5000/api/ecdisksnap/rbd/ec/image/new1
curl --user admin:admin
-X DELETE http://192.168.122.69:5000/api/ecdisksnap/rbd/ec/image/new1
Contributor:

I may be missing something about REST or flask specifically, but I don't understand the need for a parallel set of EC APIs (disk vs ecdisk and disksnap vs ecdisksnap). Why not just add an optional ecpool parameter? This would mimic the "rbd create" CLI (--data-pool option).

Contributor:

Thinking more about this, I think we should avoid tying this to EC at all. The actual feature that is being surfaced is a separate data pool. Let's call the new parameter datapool and, as sketched below:

  • if not present, create a "normal" image
  • if present, get the type of the pool
    • if replicated, create with --datapool=<datapool>
    • if EC, validate EC pool (bluestore && allow_ec_overwrites=true) and create with --datapool=<datapool>
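
A rough sketch of that flow (pool_type() and validate_ec_pool() are hypothetical helpers standing in for the corresponding RADOS queries, not existing ceph-iscsi functions):

    # Sketch only: decide the creation arguments from the optional datapool.
    def build_create_args(pool, image, size, datapool=None):
        args = {'pool': pool, 'image': image, 'size': size}
        if datapool is None:
            return args                      # "normal" image, single pool
        if pool_type(datapool) == 'erasure':
            validate_ec_pool(datapool)       # e.g. allow_ec_overwrites enabled
        args['datapool'] = datapool          # passed through to image creation
        return args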

Member Author:

Sounds good. Let me try this.

bluestore = False
for _osd in json.loads(data):
store_type = _osd['osd_objectstore']
self.logger.debug(f"pool ({self.pool}) objectstore type is ({store_type})")
Contributor:

This is checking the objectstore type of the OSD, not the pool, so the log message is misleading.

Member Author:

Will fix it.

if err:
self.logger.error(f"Cannot get allow_ec_overwrites from pool ({self.pool})")
return err
self.logger.debug(f"erasure pool ({self.pool}) allow_ec_overwrites is enabled")
Contributor:

The actual check seems to be missing. What if allow_ec_overwrites is disabled?

Member Author:

Sorry, I didn't get this; could you explain a bit more?

Member Author:

If allow_ec_overwrites is disabled, it should abort and the image creation should fail.

Contributor:

I don't see where the actual value returned by "ceph osd pool get POOL allow_ec_overwrites" is checked. It can be true or false but the code here doesn't seem to look at it. It only checks whether the command itself succeeds or not.

Member Author:

Okay, got it. Will fix it.

It will return a 400 error code when a disk is not assigned to any
target.

Signed-off-by: Xiubo Li <[email protected]>
lxbsz force-pushed the erasure branch 4 times, most recently from adfe970 to 552209f on December 21, 2021 03:30
lxbsz requested a review from idryomov on January 1, 2022 01:20
README (outdated):
| o- disk_5 ............................................................... [disk_5 (2G)]
| o- disk_1 ........................................................... [rbd/disk_1 (1G)]
| o- disk_2 ........................................................... [rbd/disk_2 (2G)]
| o- disk_3 ........................................................... [rbd/disk_3 (2G)]
Contributor:

let's name the pool "rep" here (instead of "rbd")

Member Author:

Sure.

README (outdated):
| o- disk_1 ........................................................... [rbd/disk_1 (1G)]
| o- disk_2 ........................................................... [rbd/disk_2 (2G)]
| o- disk_3 ........................................................... [rbd/disk_3 (2G)]
| o- disk_4 ........................................................ [rbd+ec/disk_4 (1G)]
Contributor:

... and then here it would be "rep+ec", highlighting the difference

Member Author:

Will do that.

README (outdated):
store the metadata only.

When creating a disk or disk snapshot, the REST API differs slightly for an erasure pool:
you need to use the "<pool>+<datapool>" format instead of "<pool>" in the URL "http://.../<pool>/...":
Contributor:

Why? An image with a separate data pool is still unique within its metadata pool. For example, if one creates "rep/foo" (i.e. image "foo" in pool "rep"), they won't be able to create "rep+ec/foo". A different format in the URL shouldn't be necessary, the "datapool" parameter should be enough.

I think the only place for "metadatapool+datapool" notation is the CLI output where we try to be user friendly by making these details stand out. Everywhere else, including in URLs, we should use a conventional image spec (metadatapool/image/snap).

Member Author:

Okay, let's just use the datapool par here.

Contributor:

Just the metadata pool.

Member Author:

Yeah, I mean use the datapool par instead of rep+ec style.

Contributor:

What do you mean by "par"? Part?

Member Author:

Parameter.

if err:
self.logger.error(f"Cannot get allow_ec_overwrites from pool ({self.pool})")
return err
self.logger.debug(f"erasure pool ({self.pool}) allow_ec_overwrites is enabled")
Contributor:

I don't see where the actual value returned by "ceph osd pool get POOL allow_ec_overwrites" is checked. It can be true or false but the code here doesn't seem to look at it. It only checks whether the command itself succeeds or not.

break

if not bluestore:
return None
Contributor:

Did you intend for filestore (or any other objectstore except bluestore) to not produce an error? This entire function seems buggy to me because when it is called, None is interpreted as success.

Member Author:

Will fix it.

except ValueError:
pool = pools
pass
return pool, datapool, image
Contributor:

I commented on the README, but just to reiterate, parse_disk_meta shouldn't be needed.

Member Author:

No, even if we remove the rep+ec style, this will still be needed in other places, such as when loading and parsing the config from the rbd pool.

Contributor:

Can you give a specific example? Like I said, I think rep+ec should come up only when pretty-printing. It shouldn't leak outside of that and therefore should never be parsed.

Member Author:

For example, from the logs:

187 2022-01-04 20:35:46,876    DEBUG [common.py:120:_read_config_object()] - _read_config_object reading the config object
188 2022-01-04 20:35:46,876    DEBUG [common.py:170:_get_ceph_config()] - (_get_rbd_config) config object contains 'b'{\n    "created": "2021/12/21 02:    18:46",\n    "discovery_auth": {\n        "mutual_password": "",\n        "mutual_password_encryption_enabled": false,\n        "mutual_username":     "",\n        "password": "",\n        "password_encryption_enabled": false,\n        "username": ""\n    },\n    "disks": {\n        "rbd+ecpool/bl    ockX1": {\n            "allocating_host": "node01",\n            "backstore": "user:rbd",\n            "backstore_object_name": "rbd.blockX1",\n                "controls": {},\n            "created": "2021/12/21 02:39:38",\n            "datapool": "ecpool",\n            "image": "blockX1",\n                "owner": "node02",\n            "pool": "rbd",\n            "pool_id": 2,\n            "updated": "2021/12/21 02:39:49",\n            "wwn":     "e121dbf4-10e2-4f60-83ca-4480da5875ce"\n        },\n        "rbd/blockX0": {\n            "allocating_host": "node01",\n            "backstore": "u    ser:rbd",\n            "backstore_object_name": "rbd.blockX0",\n            "controls": {},\n            "created": "2021/12/21 02:19:21",\n                "datapool": null,\n            "image": "blockX0",\n            "owner": "node01",\n            "pool": "rbd",\n            "pool_id": 2,\n                "updated": "2021/12/21 02:20:03",\n            "wwn": "a0f1b2c6-ad93-4761-aa47-91ea9034a442"\n        }\n    },\n    "epoch": 11,\n    "ga    teways": {\n        "node01": {\n            "active_luns": 1,\n            "created": "2021/12/21 02:19:39",\n            "updated": "2021/12/21 0    2:20:03"\n        },\n        "node02": {\n            "active_luns": 1,\n            "created": "2021/12/21 02:19:44",\n            "updated": "20    21/12/21 02:39:49"\n        }\n    },\n    "targets": {\n        "iqn.2003-01.com.redhat.iscsi-gw:ceph-igw0": {\n            "acl_enabled": true,\n                "auth": {\n                "mutual_password": "",\n                "mutual_password_encryption_enabled": false,\n                "mutua    l_username": "",\n                "password": "",\n                "password_encryption_enabled": false,\n                "username": ""\n                },\n            "clients": {\n                "iqn.1994-05.com.redhat:client": {\n                    "auth": {\n                        "mutual    _password": "",\n                        "mutual_password_encryption_enabled": false,\n                        "mutual_username": "",\n                            "password": "",\n                        "password_encryption_enabled": false,\n                        "username": ""\n                        },\n                    "group_name": "",\n                    "luns": {\n                        "rbd+ecpool/blockX1": {\n                                "lun_id": 1\n                        },\n                        "rbd/blockX0": {\n                            "lun_id": 0\n                            }\n                    }\n                }\n            },\n            "controls": {},\n            "created": "2021/12/21 02:19:33",\n                "disks": {\n                "rbd+ecpool/blockX1": {\n                    "lun_id": 1\n                },\n                "rbd/blockX0"    : {\n                    "lun_id": 0\n                }\n            },\n            "groups": {},\n            "ip_list": [\n                "172.    
16.219.128",\n                "172.16.219.138"\n            ],\n            "portals": {\n                "node01": {\n                    "gateway    _ip_list": [\n                        "172.16.219.128",\n                        "172.16.219.138"\n                    ],\n                    "ina    ctive_portal_ips": [\n                        "172.16.219.138"\n                    ],\n                    "portal_ip_addresses": [\n                            "172.16.219.128"\n                    ],\n                    "tpgs": 2\n                },\n                "node02": {\n                        "gateway_ip_list": [\n                        "172.16.219.128",\n                        "172.16.219.138"\n                    ],\n                        "inactive_portal_ips": [\n                        "172.16.219.128"\n                    ],\n                    "portal_ip_addresses": [\    n                        "172.16.219.138"\n                    ],\n                    "tpgs": 2\n                }\n            },\n            "u    pdated": "2021/12/21 02:40:03"\n        }\n    },\n    "updated": "2021/12/21 02:40:03",\n    "version": 11\n}''

We can see:

"disks": { "rep+ecpool/blockX1": { "lun_id": 1 }, ....

We save the disks in the rep+ec/image style.

Member Author (Jan 4, 2022):

Yeah, we could also save the data pool as a separate "datapool": "ecpool" entry in the config, but with the current code I would need to change or rework many places and it wouldn't benefit much.

[Edit]: this style can also easily be used for the gwcli output.

Contributor (Jan 4, 2022):

No, we should save all disks in rep/image style, or, in other words, by the image spec. For the CLI, the image spec has the following format: pool/image@snap. In ceph-iscsi, I guess it is pool/image/snap (i.e. @ isn't used) but this minor difference is irrelevant here.

The pool in pool/image@snap is always the metadata pool. The (optional) data pool is never part of the spec. Let's not deviate from the rest of RBD in ceph-iscsi.
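
For illustration, a sketch of how such a spec would split in ceph-iscsi's pool/image/snap form (hypothetical helper, not code from this PR; note that the data pool never appears):

    # Parse "pool/image" or "pool/image/snap"; the data pool is not part of
    # the spec, so nothing here needs to know about it.
    def split_image_spec(spec):
        parts = spec.split('/')
        if len(parts) == 2:
            pool, image = parts
            snap = None
        elif len(parts) == 3:
            pool, image, snap = parts
        else:
            raise ValueError(f"invalid image spec '{spec}'")
        return pool, image, snap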

Member Author:

Hmmm, okay, let me recode it this week.

lxbsz (Member Author) commented Jan 6, 2022

@idryomov

I have fixed them all; the gwcli ls output now looks like:

[root@node02 ceph-iscsi]# gwcli ls
o- / ......................................................................................................................... [...]
  o- cluster ......................................................................................................... [Clusters: 1]
  | o- ceph .......................................................................................................... [HEALTH_WARN]
  |   o- pools .......................................................................................................... [Pools: 6]
  |   | o- cephfs.a.data ......................................................... [(x3), Commit: 0.00Y/28722104K (0%), Used: 0.00Y]
  |   | o- cephfs.a.meta ....................................................... [(x3), Commit: 0.00Y/28722104K (0%), Used: 134000b]
  |   | o- datapool ........................................................ [(x3), Commit: 0.00Y/28722104K (0%), Used: 3228247703b]
  |   | o- device_health_metrics ................................................. [(x3), Commit: 0.00Y/28722104K (0%), Used: 0.00Y]
  |   | o- ecpool .............................................................. [(2+2), Commit: 0.00Y/57444208K (0%), Used: 30788K]
  |   | o- rbd ................................................................. [(x3), Commit: 36M/28722104K (0%), Used: 12500238b]
  |   o- topology ................................................................................................ [OSDs: 3,MONs: 3]
  o- disks ......................................................................................................... [36M, Disks: 3]
  | o- rbd ............................................................................................................. [rbd (36M)]
  |   o- blockX0 ...................................................................................... [rbd/blockX0 (Unknown, 12M)]
  |   o- blockX1 ...................................................................................... [rbd/blockX1 (Unknown, 12M)]
  |   o- blockX2 ............................................................................... [rbd+ecpool/blockX2 (Unknown, 12M)]
  o- iscsi-targets ............................................................................... [DiscoveryAuth: None, Targets: 1]
    o- iqn.2003-01.com.redhat.iscsi-gw:ceph-igw0 ......................................................... [Auth: None, Gateways: 2]
      o- disks .......................................................................................................... [Disks: 2]
      | o- rbd/blockX0 ..................................................................................... [Owner: node01, Lun: 0]
      | o- rbd/blockX2 ..................................................................................... [Owner: node02, Lun: 1]
      o- gateways ............................................................................................ [Up: 2/2, Portals: 2]
      | o- node01 ............................................................................................ [172.16.219.128 (UP)]
      | o- node02 ............................................................................................ [172.16.219.138 (UP)]
      o- host-groups .................................................................................................. [Groups : 0]
      o- hosts ....................................................................................... [Auth: ACL_ENABLED, Hosts: 1]
        o- iqn.1994-05.com.redhat:client .................................................... [LOGGED-IN, Auth: None, Disks: 2(24M)]
          o- lun 0 ............................................................................... [rbd/blockX0(12M), Owner: node01]
          o- lun 1 ............................................................................... [rbd/blockX2(12M), Owner: node02]

Please take a look at this. All the REST APIs are still the same as before, except that a datapool=<name> parameter has been added to some of them.

I am still testing this version, not finished yet.

BTW, do you think we need to display the <rep>+<ec>/image under the iscsi-targets ?

lxbsz (Member Author) commented Jan 7, 2022

So far, all my tests work well.

lxbsz requested a review from idryomov on January 7, 2022 08:25
README (outdated):
o- lun 0 ......................................... [rep/disk_2(2G), Owner: rh7-gw1]
o- iqn.2003-01.com.redhat.iscsi-gw:ceph-gw3 ................... [Auth: None, Gateways: 2]
o- disks ................................................................... [Disks: 1]
| o- rep/disk_4 ........................................... [Owner: rh7-gw2, Lun: 0]
Contributor:

Just wondering why this line isn't right-justified like others?

Member Author:

I just forgot to add the '...' padding back after deleting the '+ec' from the previous version. Will fix it.

README (outdated):
o- host-groups ........................................................... [Groups : 0]
o- hosts ................................................ [Auth: ACL_ENABLED, Hosts: 1]
o- iqn.1994-05.com.redhat:rh7-client .......... [LOGGED-IN, Auth: None, Disks: 1(1G)]
o- lun 0 ...................................... [rep/disk_4(1G), Owner: rh7-gw1]
Contributor:

ditto

disk_key = "{}/{}".format(kwargs['pool'], kwargs['image'])

if mode in ['create', 'resize']:

if kwargs['pool'] not in get_pools():
return "pool name is invalid"
if datapool and datapool not in get_pools():
Contributor:

Store get_pools() result and use it here to avoid issuing the same RADOS RPC twice.
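
Something along these lines (a sketch of the suggestion; _check_pools is a hypothetical wrapper, and get_pools comes from the snippet above):

    def _check_pools(kwargs, datapool):
        # Call get_pools() once and reuse the result for both checks.
        existing_pools = get_pools()
        if kwargs['pool'] not in existing_pools:
            return "pool name is invalid"
        if datapool and datapool not in existing_pools:
            return "datapool name is invalid"
        return None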

Member Author:

Sure, will fix it.

gwcli/storage.py (outdated):
@@ -205,7 +214,8 @@ def ui_command_create(self, pool=None, image=None, size=None, backstore=None, ww

The syntax of each parameter is as follows;
pool : Pool and image name may contain a-z, A-Z, 0-9, '_', or '-'
image characters.
datapool: Data pool name for erasure code pool may contain a-z, A-Z, 0-9, '_', or '-'
image : characters.
Contributor:

The lines are mixed up here.

Member Author:

Done.

gwcli/storage.py (outdated):
return True

self.logger.error("Invalid pool ({}). Must already exist and "
"be replicated".format(pool))
self.logger.error(f"Invalid pool ({pool}), the type is ({pool_object.type})."
Contributor:

Suggested change
self.logger.error(f"Invalid pool ({pool}), the type is ({pool_object.type})."
self.logger.error(f"Invalid pool '{pool}', the type is '{pool_object.type}'."

Member Author:

Will fix it.

gwcli/storage.py (outdated):
@@ -645,6 +665,8 @@ def __init__(self, parent, image_id, image_config, size=None,

UINode.__init__(self, self.rbd_image, parent)

self.datapool = image_config.get('datapool', None)
self.is_erasure = True if self.datapool is not None else False
Contributor:

It looks like is_erasure is not used anywhere and could be removed. (And again, an image with a separate data pool isn't necessarily an EC image.)

Member Author:

Yeah, it's a stale parameter from an old version; will remove it.

@@ -1367,7 +1374,7 @@ def lun_reconfigure(image_id, controls, backstore):
api_vars['controls'] = json.dumps(controls)

# activate disk
api_vars['mode'] = 'activate'
api_vars = {'mode': 'activate'}
Contributor:

This might drop controls key from the dictionary added just a couple of lines above. Did you intend to do that?
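
In other words (illustrative snippet, not the actual lun_reconfigure() body; the control name is only an example):

    import json

    controls = {'max_data_area_mb': 8}             # example controls dict
    api_vars = {'controls': json.dumps(controls)}  # added a few lines earlier

    # Updating the existing dict keeps the 'controls' key:
    api_vars['mode'] = 'activate'
    assert 'controls' in api_vars

    # Reassigning the whole dict, as the diff does, would silently drop it:
    # api_vars = {'mode': 'activate'}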

Member Author:

Good catch, will fix it.

gwcli/storage.py (outdated):
@@ -153,18 +161,18 @@ def reset(self):
for child in children:
self.remove_child(child)

def ui_command_attach(self, pool=None, image=None, backstore=None, wwn=None):
def ui_command_attach(self, pool=None, image=None, backstore=None, wwn=None, datapool=None):
Contributor:

Attach (i.e. adding an existing image to the gateway) shouldn't ask for datapool. Given that the parameter is optional, the user can always omit it even if the image has a separate data pool. So it is flawed from the get go.

Think of the datapool parameter like the size parameter: we ask for size only on create (and resize, but resize is irrelevant here). The same goes for data pool -- it only affects the creation of the image and should be asked for only on create.
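
Illustrative signatures for the distinction being drawn (a sketch only; the class name is a stand-in for the gwcli node class):

    class Disks:
        # attach: the image already exists, so its data pool is already baked in
        def ui_command_attach(self, pool=None, image=None, backstore=None,
                              wwn=None):
            ...

        # create: like size, the data pool only matters at image creation time
        def ui_command_create(self, pool=None, image=None, size=None,
                              backstore=None, wwn=None, datapool=None):
            ...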

Member Author:

Yeah, right, I just missed the create_image parameter. Will fix it.

disk_key = "{}/{}".format(kwargs['pool'], kwargs['image'])

if mode in ['create', 'resize']:

if kwargs['pool'] not in get_pools():
return "pool name is invalid"
if datapool and datapool not in get_pools():
return "datapool name is invalid"
Contributor:

This error message could be more precise: a pool with a given name just doesn't exist. The name itself may very well be valid (in the sense of consisting of valid characters, etc), there is a separate check and an error message for that below.

Member Author:

Sounds good and will fix it.

def _erasure_pool_check(self):
# skip the check if no datapool is specified
if not self.datapool:
return True
Contributor:

A data pool doesn't have to be an EC pool. A user may want to create an image with a separate replicated data pool. So two things (a sketch of the resulting check follows this list):

  • _erasure_pool_check() should check the pool type and, only if the pool is EC, check whether allow_ec_overwrites is enabled
  • The objectstore check should be removed. As I already noted earlier in #237 (comment), enabling allow_ec_overwrites on any object store other than bluestore isn't possible in normal configurations, so the allow_ec_overwrites check would be sufficient.
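
Here is that sketch, assuming a connected rados.Rados handle; the helper name is hypothetical, and the mon commands are the usual "osd dump" and "osd pool get":

    import json

    # Sketch of the suggested check: get the pool type from "osd dump" and,
    # only for erasure pools, verify allow_ec_overwrites. `cluster` is a
    # connected rados.Rados handle.
    def datapool_usable(cluster, datapool):
        ret, out, errs = cluster.mon_command(
            json.dumps({'prefix': 'osd dump', 'format': 'json'}), b'')
        if ret != 0:
            raise RuntimeError(f"osd dump failed: {errs}")
        pools = {p['pool_name']: p for p in json.loads(out)['pools']}
        if datapool not in pools:
            return False                      # pool doesn't exist
        if pools[datapool]['type'] != 3:      # 3 == erasure, 1 == replicated
            return True                       # replicated data pool is fine
        ret, out, errs = cluster.mon_command(
            json.dumps({'prefix': 'osd pool get', 'pool': datapool,
                        'var': 'allow_ec_overwrites', 'format': 'json'}), b'')
        if ret != 0:
            raise RuntimeError(f"osd pool get failed: {errs}")
        return bool(json.loads(out).get('allow_ec_overwrites', False))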

Member Author:

Sure, will fix it.

idryomov (Contributor) commented Jan 8, 2022

BTW, do you think we need to display the <rep>+<ec>/image under the iscsi-targets ?

Probably not.

lxbsz force-pushed the erasure branch 2 times, most recently from cb3a5a6 to 19c8683 on January 10, 2022 06:23
idryomov (Contributor) left a comment:

I don't think you meant to include gwcli/storage.py.orig file in the PR. I can't see gwcli/storage.py diff because of that.

Erasure coded pools do not support omap, so here we store only the data
in the EC pool and store the metadata in a replicated pool.

This will add a "datapool=<ec_pool_name>" parameter to specify the
erasure coded pool to store the data when creating a disk, and the
"pool=<name>", which is "rbd" as default, will be used to store the
metadata only.

The backstore object name is <pool>.<image> in the replicated pool,
which is the same as for a non-erasure image.

Signed-off-by: Xiubo Li <[email protected]>
lxbsz (Member Author) commented Jan 18, 2022

I don't think you meant to include gwcli/storage.py.orig file in the PR. I can't see gwcli/storage.py diff because of that.

Removed. Thanks.

lxbsz (Member Author) commented Feb 14, 2022

@idryomov Please take a look when you are free, thanks.
