Discussion:
[pve-devel] qemu memory hotplug limited to 64 dimm
Alexandre DERUMIER
2016-12-12 15:51:11 UTC
Hi,

a Proxmox user has reported a problem hotplugging memory with high values (48G):

https://forum.proxmox.com/threads/apt-get-upgrade-to-update-proxmox-server.30992/


It seems that qemu checks against a vhost value:

https://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg05998.html

cat /sys/module/vhost/parameters/max_mem_regions gives 64 (verified on my servers).


I think this limit was not present when we implemented hotplug.

Not sure what we can do, but I think our method doesn't scale anymore :(

Maybe we should introduce a new fixed "dimm size" option, to allow hotplugging more memory?
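For reference, a rough sketch of why 48G runs into the 64-region limit, assuming the current layout of 1G static memory plus up to 32 dimms per size, starting at 512M and doubling (as described later in this thread); the exact count also depends on the other memory regions qemu registers, so take it as an approximation:

  # count the dimms needed to reach a 48G target under the assumed layout
  target=$((48 * 1024)); mem=1024; dimm=512; dimms=0; per_size=0
  while [ "$mem" -lt "$target" ]; do
      mem=$((mem + dimm)); dimms=$((dimms + 1)); per_size=$((per_size + 1))
      if [ "$per_size" -eq 32 ]; then dimm=$((dimm * 2)); per_size=0; fi
  done
  echo "$dimms dimms + 1 static region = $((dimms + 1)) regions for ${mem}M"
  # prints: 63 dimms + 1 static region = 64 regions for 49152M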
Alexandre DERUMIER
2016-12-12 15:54:39 UTC
The vhost 64-region limit was introduced here, because of a qemu crash with bigger values:

https://patchwork.kernel.org/patch/6709021/

Dietmar Maurer
2016-12-12 16:09:41 UTC
Can't we simply increase that max_mem_regions default value?
Post by Alexandre DERUMIER
The vhost 64-region limit was introduced here, because of a qemu crash with bigger values:
https://patchwork.kernel.org/patch/6709021/
Alexandre DERUMIER
2016-12-12 18:29:54 UTC
Post by Alexandre DERUMIER
Post by Dietmar Maurer
Can't we simply increase that max_mem_regions default value?
I'm trying to see why they limited it to 64.

Maybe a qemu bug with previous qemu versions? I don't know.

I have some big spare servers with 370G RAM; I'll try to run tests on them.


Alexandre DERUMIER
2016-12-12 18:42:03 UTC
I found a kernel patch to bump it to 509 (it seems another kvm fix was needed):

http://www.spinics.net/lists/kvm/msg117654.html

But I think it was never committed to the kernel.

I'll try to follow the discussion.

Alexandre DERUMIER
2016-12-12 18:46:56 UTC
also here:
https://patchwork.kernel.org/patch/5825741/


It seems that increasing the number of regions decreases performance, because of extra lookups.
(I think with more regions it's some kind of tree, where you need to look up the parent region first, then the child region.)


Alexandre DERUMIER
2016-12-13 15:12:55 UTC
I have asked the user to load the vhost module with max_mem_regions=509; it seems to work fine.

Maybe we can force it in the kernel directly?
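For the record, a minimal sketch of what that test amounts to on the host (assuming no running VM is currently using vhost-net, since the module has to be reloaded):

  modprobe -r vhost_net vhost          # unload vhost_net and its vhost dependency
  modprobe vhost max_mem_regions=509   # reload vhost with the higher limit
  modprobe vhost_net
  cat /sys/module/vhost/parameters/max_mem_regions   # should now print 509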


Dietmar Maurer
2016-12-13 15:26:06 UTC
Post by Alexandre DERUMIER
I have asked the user to load the vhost module with max_mem_regions=509; it seems to work fine.
Maybe we can force it in the kernel directly?
yes, seems to be easy:

drivers/vhost/vhost.c:static ushort max_mem_regions = 64;

What value should we set exactly?

Any drawbacks?
Wolfgang Bumiller
2016-12-13 15:29:23 UTC
Post by Dietmar Maurer
Post by Alexandre DERUMIER
I have asked the user to load the vhost module with max_mem_regions=509; it seems to work fine.
Maybe we can force it in the kernel directly?
drivers/vhost/vhost.c:static ushort max_mem_regions = 64;
What value should we set exactly?
Any drawbacks?
It's a hack... I find the idea of configuring the dimm size more
appealing. Why would you want a huge amount of tiny DIMMs in a VM,
especially if it means kernel-patching... ? :|
Dietmar Maurer
2016-12-13 15:37:41 UTC
Post by Wolfgang Bumiller
Post by Dietmar Maurer
Any drawbacks?
It's a hack... I find the idea of configuring the dimm size more
appealing. Why would you want a huge amount of tiny DIMMs in a VM,
Because it works out of the box (for the users) ...
Post by Wolfgang Bumiller
especially if it means kernel-patching... ? :|
Why is that patch not applied upstream? Maybe we can ask on the kernel list?
Alexandre DERUMIER
2016-12-13 18:34:07 UTC
Post by Wolfgang Bumiller
Post by Dietmar Maurer
Why is that patch not applied upstream? Maybe we can ask on the kernel list?
From what I have read, there is a performance impact with more than 64 dimms,
and until they have fixed it, they don't want to increase the value:

https://patchwork.kernel.org/patch/5825741/
"I'm almost done with approach suggested by Paolo,
i.e. replace linear search with faster/scalable lookup alg.
Hope to post it soon.
"

We can simply add a file:
/etc/modprobe.d/vhost.conf
options vhost max_mem_regions=509
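Note that an option in /etc/modprobe.d/ only takes effect the next time the vhost module is loaded, so the module would have to be reloaded (or the host rebooted) before checking:

  cat /sys/module/vhost/parameters/max_mem_regions   # should print 509 afterwards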


But, like Wolfgang, I also think that being able to define an optional fixed dimm size could be great too.

Wolfgang Bumiller
2016-12-14 14:07:16 UTC
So I see a few options, one or more of which we should consider:

1) Make the *static* memory size configurable:

It doesn't make sense to me to e.g. want to go from 1G to 200G in
small steps. Worse is the fact that even when I start with a higher
amount, knowing that I'm only going to increase it, I still start with
lots and lots of tiny dimms. E.g. 8G = 1G static + 14 x 512M dimms.

This could be added as a separate option or even to the existing
`memory` option without breaking existing configs (by turning it into a
property string with the current amount as the default key.)
I could then do:

memory: 8192,static=8192

This example frees up 14 dimm slots alone.
The downside is this VM cannot go below 8G without a reboot.

The dimm size should then probably not start at 512M either. That's
where the next change comes in:

2) Change the steps one way or another:

a) Generally change the number of steps before increasing the step
size:
The current 32 makes sense for the first few gigs, but if we e.g. have
a machine with 101G memory, we'd still be doing 2G steps. That
amounts to a 2% granularity. At 113G we bump this up to 4G steps,
which is a 3.5% granularity. (See the sketch at the end of this mail.)
I propose we use no more than 16 dimms of each size. (Heck, I'd even
be fine with 8 personally, although 14 of the 512M dimms would be
good to get to a "nice" 8G in small steps. ;-) )

This change is not backward compatible and would break memory
hot-unplugging of running machines after the upgrade.
Or well, it would still "happen", but our count would be out of
sync.

b) Make the _initial_ dimm size configurable:
This would work well in combination with (1). (This initial dimm
size would not be hot-pluggable as that would throw off our
counter.)

c) Have the option to use only 1 fixed dimm size.
Since we already have a way to get a list of a running VM's dimms,
this option could even be hot-pluggable. The unplug code would have
to be changed to fetch the list of dimms from qemu rather than
assume our fixed dimm list is used.
[*] This would probably be a good idea in any case ;-)
Of course this would allow the user to add a 16G dimm, change the
dimm size to 1G and then try to reduce memory by 1 and fail.
(I see no reason to protect people from too many of their own
mistakes, but we *could* of course limit the hot-plugging of the
dimm size to increasing only...)

d) Use an explicit dimm list.
This would be an alternative to (1), you define the static memory
and the dimms, the latter can be hot plugged:

memory: 8192,dimms=512x4:1024x4
or
memory: 8192
dimms: 512x4 1024x4

This would be the equivalent of 8G static memory plus 4 dimms of 512M
and 4 dimms of 1G - a total of 14G.

For backward compatibility we could require dimms to be "defined
but empty" to use the new method, and keep the old method
otherwise.

Downside: who wants to make a GUI for that? ;-)

3) Drop support for hot*UN*plugging altogether and simply always hot-add
one big dimm. ;-)
On that note, my guests currently hang when I unplug memory and I bet
that happens to all of you and nobody ever noticed because nobody needs
that feature anyway ;-)
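For illustration, a rough sketch of the granularity argument in (2a), assuming the current layout of 1G static plus N dimms per size, starting at 512M and doubling (N=32 today); this is not PVE code, just the arithmetic:

  step_at() {   # usage: step_at <target_mb> <dimms_per_size>
      local target=$1 n=$2 mem=1024 dimm=512 used=0
      while [ "$mem" -lt "$target" ]; do
          mem=$((mem + dimm)); used=$((used + 1))
          if [ "$used" -eq "$n" ]; then dimm=$((dimm * 2)); used=0; fi
      done
      echo "at ${target}M the next step is ${dimm}M"
  }
  step_at $((101 * 1024)) 32   # current scheme: still 2G steps at 101G
  step_at $((101 * 1024)) 16   # 16 dimms per size: already 4G steps at 101G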
Alexandre DERUMIER
2016-12-14 14:45:17 UTC
Post by Wolfgang Bumiller
3) Drop support for hot*UN*plugging altogether and simply always hot-add
one big dimm. ;-)
On that note, my guests currently hang when I unplug memory and I bet
that happens to all of you and nobody ever noticed because nobody needs
that feature anyway ;-)
Yes, it's a Linux guest limitation (the qemu implementation is ok): Linux kernel memory is not movable currently, and if it's located on a dimm, you can't unplug that dimm, as the kernel memory can't be moved to another location.




Wolfgang Bumiller
2016-12-14 15:44:52 UTC
Post by Alexandre DERUMIER
Post by Wolfgang Bumiller
3) Drop support for hot*UN*plugging altogether and simply always hot-add
one big dimm. ;-)
On that note, my guests currently hang when I unplug memory and I bet
that happens to all of you and nobody ever noticed because nobody needs
that feature anyway ;-)
Yes, it's a Linux guest limitation (the qemu implementation is ok): Linux kernel memory is not movable currently, and if it's located on a dimm, you can't unplug that dimm, as the kernel memory can't be moved to another location.
Apparently I do have to explicitly mention that I'm talking about *unused* memory, iow. memory where the /sys/../enabled file is 0. I think this was working at some point on modern guest kernels?
Alexandre DERUMIER
2016-12-14 16:05:20 UTC
Post by Wolfgang Bumiller
Apparently I do have to explicitly mention that I'm talking about *unused* memory, iow. memory where the /sys/../enabled file is 0. I think this was working at some point on modern guest kernels?
Do you mean "online" instead of "enabled"?

Also, here are some notes I have put in the wiki:



online should be set by a udev rule, to auto-online dimms on hotplug:

/lib/udev/rules.d/80-hotplug-cpu-mem.rules
SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}="1"
SUBSYSTEM=="memory", ACTION=="add", TEST=="state", ATTR{state}=="offline", ATTR{state}="online"

For Linux kernel >= 4.7,
you don't need the udev rule for memory hotplug; you only need to add this kernel parameter at boot:

memhp_default_state=online

Memory hot-unplug:
- Memory unplug doesn't work on Windows currently (<= Win10 currently)
- Memory unplug can be unstable on Linux (<= kernel 4.8 currently)
For Linux memory unplug, you need to have the movable zone enabled in the kernel config (not enabled by default on Debian/Ubuntu):
CONFIG_MOVABLE_NODE=y
and
the "movable_node" boot kernel parameter enabled



Alexandre DERUMIER
2016-12-14 16:13:02 UTC
Also see this Red Hat bugzilla about a kernel bug (even with online_movable):

https://bugzilla.redhat.com/show_bug.cgi?id=1320534

"
Description Milan Zamazal 2016-03-23 08:38:07 EDT
Description of problem:

File /usr/lib/udev/rules.d/40-redhat.rules contains the following line:

SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"

This ensures that hotplugged memory gets available to the system. In order to be able to hotunplug the hotplugged memory later, its state should be set to `online_movable' instead of `online'. Otherwise the system may allocate kernel pages in the hotplugged memory, preventing its removal. To avoid this problem the rule should look like

SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online_movable"

However, there is a kernel bug that prevents this simple change from working, see https://bugzilla.redhat.com/1314306. It's not possible to set memory blocks as online_movable in random order (it results in "Permission denied" failures), the state must be changed from the highest numbered plugged memory block to the lowest one. So either the kernel bug should be fixed before the change, or the udev rules should ensure changing the state of the memory blocks to `online_movable' in the proper order (from the highest to the lowest memory block).

Version-Release number of selected component (if applicable):

219-19.el7_2.4.x86_64

"




Dietmar Maurer
2016-12-13 15:29:52 UTC
Post by Alexandre DERUMIER
I have asked the user to load the vhost module with max_mem_regions=509; it seems to work fine.
Maybe we can force it in the kernel directly?
Oh, are all 5 patches missing from the current kernel? Or do we only need to increase the value?