Why Shouldn't You Overload &&, ||, and , (comma)?

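C++'s core syntax provides the two logical operators || and && as well as the , (comma) operator. A class is allowed to overload them, but you should not do so. In this article I quote the relevant wording from the standard and explain why.
In short, the built-in || and && have short-circuit evaluation semantics. Once you overload them they become ordinary function calls, whose semantics are entirely different from those of the built-in || and &&.
The , operator evaluates its operands from left to right. An overloaded version is likewise a function call, so it too differs from the built-in semantics.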

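First, a brief look at the logical operators && and || in C++. Their usage is too basic to be worth explaining, so I will simply quote the standard (ISO/IEC 14882:2014):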

logical AND operator: The && operator groups left-to-right. The operands are both contextually converted to bool (Clause 4). The result is true if both operands are true and false otherwise. Unlike &, && guarantees left-to-right evaluation: the second operand is not evaluated if the first operand is false.
The result is a bool. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.

logical OR operator: The || operator groups left-to-right. The operands are both contextually converted to bool (Clause 4). It returns true if either of its operands is true, and false otherwise. Unlike |, || guarantees left-to-right evaluation; moreover, the second operand is not evaluated if the first operand evaluates to true.
The result is a bool. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.

For the question of why operator&& and operator|| should not be overloaded, the two most important points are:

  • AND: If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.
  • OR: If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.

Together with the rule quoted above that the second operand is not evaluated once the first has decided the result, this gives && and || their short-circuit behavior: The operators && and || will not evaluate their second argument unless doing so is necessary. (TC++PL, 4th ed.)

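However, once && and || are overloaded they become functions, and every argument is evaluated when it is passed to its parameter:

[ISO/IEC 14882:2014] All side effects of argument evaluations are sequenced before the function is entered.

Moreover, the C++ standard explicitly states that the order in which function arguments are evaluated is unspecified.

[ISO/IEC 14882:2014 §8.3.6/9] The order of evaluation of function arguments is unspecified.

In other words, the side effects of a function's arguments take place before the function body is entered. This defeats the short-circuit behavior of the logical operators and conflicts with the built-in semantics, which is why && and || should not be overloaded.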

Consider the following example:

#include <cstdio>

int main()
{
    int a = 0;
    int b = 0;
    ++a || ++b;  // ++a yields 1 (true), so ++b is never evaluated
    printf("%d\n", b);
}
// output: 0

Because ++a is already true, ++b is never executed; this is exactly the short-circuit rule.
Now look at the following code:

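#include <cstdio>

struct A {
    int ival = 0;
    A& operator++() {
        ival++;
        return *this;
    }
    // Overloaded ||: an ordinary member function, so both operands
    // are evaluated before it is called.
    template<typename U>
    bool operator||(U& x) {
        return ival != 0 || x;
    }
};

int main()
{
    A aobj;
    int b = 0;
    ++aobj || ++b;  // calls A::operator||<int>(int&)
    printf("%d\n", b);
}
// output: 1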

Here || is overloaded inside class A, so ++aobj||++b; becomes a function call. Let's look at its LLVM IR:

define i32 @main() #4 {
%1 = alloca %struct.A, align 4
%2 = alloca i32, align 4
call void @_ZN1AC2Ev(%struct.A* %1) #3
store i32 0, i32* %2, align 4
%3 = call dereferenceable(4) %struct.A* @_ZN1AppEv(%struct.A* %1)
%4 = load i32, i32* %2, align 4
%5 = add nsw i32 %4, 1
store i32 %5, i32* %2, align 4
%6 = call zeroext i1 @_ZN1AooIiEEbRT_(%struct.A* %3, i32* dereferenceable(4) %2)
%7 = load i32, i32* %2, align 4
%8 = call i32 (i8*, ...) @_ZL6printfPKcz(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i32 0, i32 0), i32 %7)
ret i32 0
}

Pay particular attention to the following lines, which are the operations performed for the statement ++aobj||++b;:

%3 = call dereferenceable(4) %struct.A* @_ZN1AppEv(%struct.A* %1)
%4 = load i32, i32* %2, align 4
%5 = add nsw i32 %4, 1
store i32 %5, i32* %2, align 4
%6 = call zeroext i1 @_ZN1AooIiEEbRT_(%struct.A* %3, i32* dereferenceable(4) %2)

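You may be puzzled by names such as _ZN1AppEv and _ZN1AooIiEEbRT_: they look like function calls, but it is hard to tell which functions they refer to. I cover this in detail in two of my other articles: "An Analysis of the C/C++ Compilation and Linking Model" and "Why Do We Need extern "C"?".
For now, let's look directly at what the two symbols _ZN1AppEv and _ZN1AooIiEEbRT_ stand for: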

$ c++filt _ZN1AppEv
A::operator++()
$ c++filt _ZN1AooIiEEbRT_
bool A::operator||<int>(int&)

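Judging by the order of the two symbols in the IR, ++aobj||++b; executes as follows:

  1. First, aobj is incremented
  2. Then b is incremented
  3. Finally, A::operator|| is called

This means that ++b is executed regardless of the result of ++aobj, so nothing is left of short-circuit evaluation! The same holds for &&.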

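Next, why operator, should not be overloaded.
In C/C++, the , operator evaluates its operands from left to right:

[ISO/IEC 14882:2014] A pair of expressions separated by a comma is evaluated left-to-right; the left expression is a discarded-value expression.

So an overloaded operator, has two problems:

  1. The arguments are fully evaluated before the function body is entered
  2. Under the C++14 rules quoted here, the order in which the operands of an overloaded operator, are evaluated is not fixed

This conflicts with the semantics of the built-in , operator, so operator, should not be overloaded.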

Consider the following code:

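struct A {
    int ival = 0;
    // Postfix increment; returns a reference to keep the example short.
    A& operator++(int) {
        ival++;
        return *this;
    }
    // Overloaded comma: an ordinary member function call.
    template<typename T>
    T operator,(T x) {
        return x;
    }
};

int main()
{
    A aobj;
    int b = 0;
    aobj++, b = aobj.ival;  // calls A::operator,<int>(int)
}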

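Here aobj++,b=aobj.ival; calls class A's operator,; in other words, it is a function call.
Being a function call, it follows the C++14 rule for evaluating function arguments: The order of evaluation of function arguments is unspecified.
"Unspecified" means that no particular order is guaranteed. With an overloaded operator, a right operand that depends on the left operand therefore gives a result you cannot rely on (in this example the result is unspecified rather than undefined behavior). C++17 later changed this rule so that the operands of an overloaded operator written in operator notation are sequenced as for the built-in operator; the short-circuiting of && and ||, however, is still lost.
The mainstream compilers I tested (GCC 6.2 / Clang 3.9) both evaluate in argument-list order, but the C++14 standard does not guarantee this.
Here is the LLVM IR for the main function above: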

define i32 @main() #4 {
%1 = alloca i32, align 4
%2 = alloca %struct.A, align 4
%3 = alloca i32, align 4
store i32 0, i32* %1, align 4
call void @_ZN1AC2Ev(%struct.A* %2) #3
store i32 0, i32* %3, align 4
%4 = call dereferenceable(4) %struct.A* @_ZN1AppEi(%struct.A* %2, i32 0)
%5 = getelementptr inbounds %struct.A, %struct.A* %2, i32 0, i32 0
%6 = load i32, i32* %5, align 4
store i32 %6, i32* %3, align 4
%7 = call i32 @_ZN1AcmIiEET_S1_(%struct.A* %4, i32 %6)
ret i32 0
}

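For aobj++,b=aobj.ival; the compiler first performs aobj's operator++(int) and then b=aobj.ival.
In summary, although mainstream C++ compilers implement this built-in-like behavior, as far as the C++ standard is concerned a function call and the built-in operator, are different things.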

That's all. If you find anything lacking, please point it out in the comments.

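Title: Why Shouldn't You Overload &&, ||, and , (comma)?
Author: 查利鹏
Published: 2017-06-24 22:21
Length: about 3.3k characters
Original link: https://imzlp.com/posts/11306/
License: CC BY-NC-SA 4.0
Reposting the full text is not permitted. When sharing an excerpt, please keep the original link and author information. Thank you!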